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ABSTRACT 


Remote  sensing  of  the  oceans  via  satellites  is  providing  useful  data  that  can  be  used  for  realtime 
input  into  and  verification  of  the  numerical  ocean  models.  To  make  an  optimum  use  of  these  data, 
efficient  methods  are  being  developed  to  handle  vast  amounts  of  data  and  provide  their  quick  analyses 
and  summaries  in  the  form  of  meso scale  features. 


Identification,  isolation  and  tracking  of  meso  scale  features  plays  an  important  role  in  numerical 
ocean  modeling.  Of  late,  there  has  been  considerable  interest  in  designing  algorithms  to  automatically 
detect  such  oceanographic  features  as  temperature  fronts  and  eddies.  This  report  provides  a  litera¬ 
ture  review  of  recent  approaches  and  efforts  on  objective  feature  identification  (OFI)  as  it  pertains  to 
oceanographic  applications.  Most  of  the  OFI  work  in  oceanography  has  been  done  on  characterizing 
the  Gulf  Stream  (GS);  and  since  the  GS  incorporates  all  the  meso  scale  feature  complexities  that  one 
may  desire  to  resolve,  the  literature  reviewed  here  pertains  to  this  feature  entirely.  It  is  felt  that  the 
work  cited  is  quite  representative  and  is  applicable  to  other  geographical  features  of  interest. 

Feature  identification  from  satellite  data  can  be  viewed  as  a  four-step  process:  (1)  edge  detection 
-  analyzes  pixels  in  satellite  images  for  frontal  boundaries;  (2)  edge  labeling  -  assigns  feature  labels 
to  frontal  pixels;  (3)  spatial  interpolation  -  fills  spatial  gaps  in  labeled  features  due  to  clouds  or  data 
sparsity;  and  (4)  expert  system  -  provides  additional  interpolation  over  space  and  time  based  on  knowl¬ 
edge  of  the  region  dynamics.  Most  research  efforts  are  focused  on  algorithm  development  for  individual 
steps.  Considerable  effort  is  required  to  put  together  a  complete  system  that  can  go  from  raw,  satel¬ 
lite  imagery  to  a  finished  product  in  terms  of  digitized  information  on  frontal  location  and  dynamics. 
The  Navy’s  Semi-Automated  Mesoscale  Analysis  System  (SAMAS)  is  perhaps  the  only  system  that  has 
combined  these  steps  in  a  modular  approach.  The  only  other  image  analysis  system  is  that  of  the  Uni¬ 
versity  of  Rhode  Island  (URI),  which  emphasizes  the  removal  of  clouds  from  AVHRR  images  to  enhance 
edge  detection. 


This  review  describes  how  different  algorithms  and  approaches  are  being  used  in  objective  feature 
identification.  After  reviewing  algorithms  for  the  four  steps  individually,  brief  descriptions  of  SAMAS 
and  the  URI  image  analysis  system  are  provided.  Finally,  methods  of  feature  extraction  and  eddy 
tracking  from  model  output  are  described.  The  Center  for  Air  Sea  Technology  (CAST)  has  implemented 


these  algorithms  for  oceanographic  visualization. 
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1.  INTRODUCTION 

Because  of  the  ever-increasing  amount  of  satellite- 
derived  data  available  and  because  computers  have 
become  faster  and  less  expensive,  there  is  a  ris¬ 
ing  interest  in  designing  algorithms  to  automati¬ 
cally  detect  oceanographic  features  such  as  temper¬ 
ature  fronts  and  eddies.  This  paper  provides  a  lit¬ 
erature  review  of  recent  approaches  and  efforts  on 
objective  feature  identification  (OFI)  pertaining  to 
oceanographic  applications. 

Characterization  of  ocean  front  and  eddy  fea¬ 
tures  plays  an  important  role  in  numerical  ocean 
modeling.  For  instance,  the  objectively-determined 
location  and  size  of  an  eddy  can  be  directly  assimi¬ 
lated  into  a  numerical  ocean  model  using  a  ‘feature’ 
model  (Robinson  and  Walstad,  1987).  The  location 
of  the  North  Wall  (NW)  of  the  Gulf  Stream  (GS)  is 
used  to  separate  water  masses  for  input  to  the  Op¬ 
timum  Thermal  Interpolation  System  (OTIS)  analy¬ 
ses  of  the  GS  region  (Cummings,  1991).  Such  OTIS 
analyses,  using  the  GS  North  Wall  (GSNW)  location, 
were  used  for  model  initialization  in  DAMEE-GSR 
(Data  Assimilation  and  Model  Evaluation  Experi¬ 
ments  -  Gulf  Stream  Region);  see  Perkins  (1993)  for 
details. 

Objectively-determined  ocean  features  also  play 
an  important  role  in  the  objective  evaluation  of  nu¬ 
merical  ocean  models.  Often,  the  performance  of  a 
numerical  model  is  defined  in  terms  of  its  capabil¬ 
ity  to  reproduce  certain  observed,  quantifiable  fea¬ 
tures.  Model  output  is  analyzed  to  quantify  the  char¬ 
acteristic  feature  and  compared  with  the  objectively- 
derived  feature  from  observations.  In  DAMEE-GSR, 
the  GSNW  location  has  been  used  for  an  assessment 
of  ocean  models  prediction  performance  (Perkins, 
1993;  Fox  et  al.,  1991). 

For  feature  extraction  from  satellite  observa¬ 
tions  or  three-dimensional  model  output,  computer 
based  algorithms  can  be  used  to  exploit  the  physi¬ 
cal  characteristics  of  the  observations.  The  develop¬ 
ment  of  such  algorithms  requires  that  a  feature  be 
specified  formally  in  terms  of  its  defining  character¬ 
istics.  Given  a  set  of  sample  values  of  a  scalar  field 
/(*,  y,  z)  over  a  volume,  a  feature  is  defined  to  be  a 


region  in  the  volume  that  has  similar  characteristics 
(Moorhead  and  Zhu,  1993).  Formally,  if  the  volume 
V  is  constituted  by  functional  values  of  /  that  are 
bounded,  then  a  feature  may  be  defined  as  follows: 

»  a  tempo  rally-dynamic  region  H  G  V  with  all 
sample  values  belonging  to  a  specified  range  Q, 
or 

•  a  region  We  V  with  sample  values  significantly 
different  from  the  neighboring  regions. 

In  the  first  case,  features  are  defined  in  terms 
of  the  original  sample  values;  the  computer  extracts 
the  subset  H  easily  in  terms  of  the  defining  char¬ 
acteristic  specified  by  Q.  A  well-known  example  of 
such  a  feature  is  the  subsurface  GSNW  defined  as 
the  15  deg  isotherm  at  200  m  depth.  This  subset 
is  easily  extracted  from  a  three-dimensional  grid- 
ded  model  output.  In  the  second  case,  features  are 
not  defined  by  the  original  sample  values  but  by  the 
region  edges.  This  property  characterizes  features 
as  only  those  regions  that  exhibit  significant  differ¬ 
ences  from  the  surrounding  values;  or  equivalently, 
these  regions  can  be  identified  because  of  the  sharp 
edges  (boundaries). 

According  to  Lybanon  and  Holyer  (1991),  gener¬ 
ation  of  oceanographic  products  from  the  interpre¬ 
tation  of  satellite  data  is  a  three-level  image  reso¬ 
lution  paradigm:  (1)  edge  detection,  (2)  edge  label¬ 
ing,  and  (3)  feature  construction.  The  edge  detec¬ 
tion  level  is  the  lowest,  which  works  at  the  pixel 
level  to  derive  feature  edges  or  boundaries.  These 
edges  are  disjoint.  To  join  or  fill  in  these  disjoint 
edges,  the  next  step  is  to  label  them  correspond¬ 
ing  to  the  features  they  represent.  The  final  step 
is  to  fill  in  the  gaps  using  spatial  interpolation.  In 
addition  to  these  three  levels,  Lybanon  and  Holyer 
(1991)  include  a  fourth  level,  viz.,  the  expert  sys¬ 
tem,  which  provides  temporal  interpolation  during 
time  periods  of  sparse  observations.  The  expert  sys¬ 
tem’s  rule  base  represents  oceanographic  knowledge 
about  the  evolution  of  mesoscale  ocean  features  in 
terms  of  their  kinematics.  These  four  steps  form 
the  Four-Tiered  Approach  for  the  development  of  the 
Navy’s  Semi-Automated  Mesoscale  Analysis  System 
(SAMAS)  (Peckinpaugh  and  Holyer,  1991). 
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Considerable  work  has  been  done  in  each  of 
the  above  categories  separately;  however,  there  has 
been  a  notable  lack  of  effort  in  putting  an  overall 
system  together  that  is  capable  of  systematically 
converting  raw  images  of  satellite  data  into  a  fin¬ 
ished  picture  which  delineates  all  mqjor  features. 
In  oceanographic  applications,  there  are  two  dif¬ 
ferent  systems  that  are  in  the  forefront.  One  is 
the  Navy’s  SAMAS  (Holyer  and  Peckinpaugh,  1990; 
Krishnakumar  et  al.,  1990a)  in  which  most  of  the 
components  are  well  identified  and  tested.  The  ex¬ 
pert  system  component  still  needs  to  be  developed. 
The  other  system  is  being  developed  at  University 
of  Rhode  Island  (URI).  URI  has  performed  consid¬ 
erable  analyses  of  the  satellite  AVHRR  images,  but 
it  has  been  limited  to  providing  labeled  edges  of  the 
features. 

In  the  edge  labeling  component,  SAMAS  uses 
the  Relaxation  Labeling  Algorithm,  which  is  an 
iterative  procedure  based  on  Bayes’  Theorem 
(Krishnakumar  et  al.,  1990a).  Correspondingly, 
URI  has  the  Contour-Following  Algorithm,  which 
restricts  the  rate  of  change  with  which  the  contour- 
curvature  can  vary  spatially.  For  spatial  interpo¬ 
lation,  SAMAS  employs  an  algorithm  developed  by 
Molinelli  and  Flanigan  (1987)  which  is  based  on 
complex  empirical  orthogonal  functions.  Evidently, 
there  are  some  gaps  beyond  this  stage  that  need  to 
be  filled.  The  review  will  be  cast  in  terms  of  the  com¬ 
ponents  of  the  two  systems.  It  will  include  some 
algorithms  which  could  possibly  indicate  direction 
for  the  development  of  the  expert  systems  using  the 
space-time  interpolation  concept. 

The  intent  of  this  publication  is  to  show  how  dif¬ 
ferent  approaches  and  modules  can  be  put  together 
for  the  OFI  process  by  reviewing  a  few  typical  al¬ 
gorithms  from  each  category.  The  discussion  brings 
out  the  essence  of  each  approach  by  including  all  rel¬ 
evant  details  of  the  illustrating  algorithm  chosen. 
This  will  provide  the  reader  with  a  full  flavor  of  the 
capabilities  as  well  as  the  intricacies  of  the  proce¬ 
dures.  However,  the  review,  by  design,  is  not  com¬ 
prehensive. 


Most  of  the  techniques  appearing  in  the  liter¬ 
ature  on  objective  identification  of  ocean  features 
are  based  on  satellite  observations  of  sea-surface 
temperature  (SST)  and  sea-surface  height  (SSH)  in¬ 
formation  from  GEOSAT  and  pertain  to  the  Gulf 
Stream  region.  Section  2  gives  a  description  of 
the  two  types  of  satellite  data,  followed  in  Section 
3  by  a  few  conventional  definitions  of  the  terms 
used.  Section  4  then  describes  the  edge  detection  ap¬ 
proaches  including  the  Cluster  Shade  Algorithm  due 
to  Holyer  and  Peckinpaugh  (1989)  and  the  URI  al¬ 
gorithm  which  is  based  on  objective  statistical  tests 
for  sequential  decision  making  (Comillon  and  Watts, 
1987;  Cayula  and  Comillon,  1992).  Section  5  cov¬ 
ers  the  feature  labeling  approach  and  basically  in¬ 
cludes  the  Relaxation  Labeling  Algorithm  adopted 
by  the  Navy  (Krishnakumar  et  al.,  1990a).  The  spa¬ 
tial  interpolation  algorithms  are  described  in  Sec¬ 
tion  6.  The  section  reviews  the  complex  empirical 
orthogonal  functions  (CEOF)  approach  of  Molinelli 
and  Flanigan  (1987),  the  Pathfinder  algorithm  (Hor¬ 
ton,  1989)  and,  more  importantly,  includes  the  de¬ 
scription  of  work  on  Bpace-time  interpolation  (Mari¬ 
ano,  1990;  Chin  and  Mariano  1993). 

With  this  development  at  hand,  we  give  in  Sec¬ 
tion  7,  a  brief  description  of  the  Navy  and  the  URI 
systems  in  their  modular  form.  As  a  complete  sys¬ 
tem,  it  is  pertinent  to  include  the  approach  of  dy¬ 
namic  interpolation  where  a  numerical  model,  ca¬ 
pable  of  reproducing  the  feature  of  the  region,  in¬ 
gests  observations  (data  assimilation)  and  predicts 
the  required  location  of  the  feature.  Finally,  in  Sec¬ 
tion  8  we  describe  an  effort  of  feature  extraction 
from  4-D  model  outputs  available  on  a  regular  grid. 
The  Center  for  Air  Sea  Technology  has  implemented 
these  techniques  in  a  visualization  case  study  us¬ 
ing  the  DieCAST  model  for  the  Gulf  of  Mexico  (Di¬ 
etrich  et  al.,  1993).  The  three-dimensional  features 
of  the  eddies  extracted  are  sharp,  and  these  eddies 
are  tracked  well  as  they  traverse  toward  the  western 
Gulf. 

2.  SATELLITE  OBSERVATIONS 

Apart  from  satellite  data,  observations  of  the 
ocean  are  quite  sparse  and  inadequate  to  com- 
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pletely  determine  mesoscale  ocean  features.  There¬ 
fore,  investigators  must  depend  on  satellite  obser¬ 
vations  that  are  abundantly  available.  For  their  op¬ 
timal  utilization  in  feature  identification  and  now¬ 
cast/forecast  activity,  their  inherent  characteris¬ 
tics,  advantages  and  drawbacks  must  be  understood 
fully. 

Altimetry  from  satellites,  like  GEOSAT,  pro¬ 
vides  global  coverage.  However,  these  data  are 
available  only  along  satellite  tracks  that  are  sep¬ 
arated  by  distances  comparable  to  the  dominant 
scales  in  the  GS  regime.  Another  problem  associated 
with  the  use  of  altimetry  is  that  the  relationship  of 
the  measured  SSH  to  the  geoid  is  unknown.  Sub¬ 
traction  of  the  time-averaged  altimetry  data  from 
the  satellite  track  data  removes  the  geoid,  but  the 
mean  ocean  circulation  signal  is  removed  as  well.  In 
spite  of  these  limitations,  satellite  SSH  data  have 
proved  useful  in  data  assimilation  studies  and  in  the 
determination  of  sea  surface  variability. 

Multichannel  SSTs  from  satellite  IR  images  pro¬ 
vide  global  coverage  of  the  World  Ocean.  The  im¬ 
ages  can  be  used  subjectively  or  objectively  to  lo¬ 
cate  the  position  of  the  GSNW.  However,  SST  data 
may  be  incomplete  due  to  cloud  cover,  and  atmo¬ 
spheric  moisture  contamination  may  not  be  totally 
corrected.  Year-long  averages  for  1986  indicate  that 
only  50  percent  of  the  Gulf  Stream,  50  percent  of 
the  warm  rings,  and  25  percent  of  the  cold  rings 
were  visible  in  the  satellite  IR  (Perkins,  1993).  Even 
with  a  carefully-culled  data  set,  which  is  composed 
of  time  periods  with  good  IR  coverage,  it  was  neces¬ 
sary  to  build  composite  data  over  ±2  days  to  define 
the  GSNW  and  ring  locations. 

3.  DEFINITION  OF  GULF  STREAM  FRONT 

The  Gulf  Stream  front,  depending  on  data  type 
or  the  application,  is  defined  in  several  ways.  Using 
SSH  and  SST,  the  feature  has  been  defined  as 
the  surface  NW.  From  AVHRR  imagery  data,  it  is 
defined  as  the  maximum  temperature  gradient  on 
the  north  side  of  the  stream,  except  for  shingles. 


While  using  GEOSAT  SSH,  the  surface  GSNW  is 
estimated  as  the  northern  region  of  abrupt  change 
in  slope  to  near  zero  of  the  SSH  that  crosses  the 
stream.  The  estimation  error  is  assumed  to  be  of  the 
order  of  ±14  km.  Another  definition  of  the  feature 
is  the  Surface  Axis,  which  is  the  locus  of  maximum 
downstream  velocity.  The  subsurface  axis  is  defined 
as  where  the  12°  C  isotherm  crosses  500  m  depth. 

For  practical  applications  to  feature  models  and 
OTIS,  the  often-used  criterion  is  the  subsurface 
NW  location,  defined  as  the  region  where  the  15°  C 
isotherm  crosses  200m.  Since  satellite  observations 
can  provide  only  surface  definition,  the  subsurface 
characterization  may  be  determined  in  two  ways: 
Use  the  statistical  relationship  between  the  surface 
and  subsurface  locations  of  the  fronts;  this  relation¬ 
ship  is  derived  from  empirical  data  available  simul¬ 
taneously  on  the  front  locations  at  the  two  levels.  Or, 
employ  statistical-dynamical  interpolation  (data  as¬ 
similation);  this  approach  dynamically  transfers  in¬ 
formation  from  observed  data  to  the  entire  modeling 
domain. 

4.  EDGE  DETECTION  ALGORITHMS 

It  is  well  known  that  the  conventional  edge 
derivative  operators  are  essentially  high-pass  filters 
that  are  sensitive  to  noise  and  not  suitable  for  ana¬ 
lyzing  oceanographic  satellite  images.  In  ocean  cir¬ 
culation  applications,  the  mesoscale  fronts  and  ed¬ 
dies  are  well  defined  and  continuous.  Thus  for  edge 
detection  from  satellite  imagery,  algorithms  must  be 
robust,  take  advantage  of  this  broad  characteristic, 
and  reject  fine  structure  while  retaining  edge  sharp¬ 
ness.  However,  gridded  data  from  model  output  are 
not  as  noisy,  and  gradient-based  algorithms  are  able 
to  extract  mesoscale  features  from  these  data.  We 
will  elaborate  on  this  more  in  Section  9. 

4.1  Navy’s  Cluster  Shade  Edge  Operator 

The  edge  detection  algorithm  developed  by 
Holyer  and  Peckinpaugh  (1989)  is  based  on  cluster 
shade  texture  measure,  which  is  derived  from  the 
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gray  level  co-occurrence  (GLC)  matrix.  Although  it 
has  been  known  that  the  GLC  matrix  contains  edge 
information  (Conners  et  al.,  1984),  it  had  not  been 
used  much  for  the  purpose. 

Using  the  GLC  matrix,  the  algorithm  computes 
the  ‘duster  shade  measure’,  which  is  essentially  the 
third  central-moment  of  intensity  levels.  Depending 
on  the  distribution  of  pixel  intensity  levels,  this 
measure  can  be  positive  or  negative.  Thus,  an 
edge  will  be  indicated  from  the  zero-crossing  of 
this  measure.  The  mathematical  formulation  of 
the  algorithm  has  three  steps:  (i)  computation  of 
the  GLC  matrix;  (ii)  computation  of  the  cluster 
shade  edge  measure;  and  (iii)  determination  of  the 
significant  zero  crossing. 

Computation  of  the  GLC  Matrix:  Its  (i,  j)th  element, 
P(i,j\Ax,Ay),  is  defined  as  the  relative  frequency 
of  the  pixel-pairs,  separated  by  (Ax,  Ay),  one  with 
intensity  level  i  and  the  other  with  intensity  level  j. 
For  an  MN  array  of  pixels  with  0, 1, ....  L  -  1  as  the 
levels  of  of  intensity,  define  a  mapping,  f(m,  n),  that 
assigns  intensity  level  to  the  pixel  (m,  n).  With  this, 
P(i,  j\Ax,  Ay)  is  defined  as 

M-Ar  TV -Ay 

P(*,j|Ax,Ay)  =  £  £  A  (I) 

m=l  n=l 


where 


if  /(m,  n)  =  i,  and 

f(m  +  Ax,  n  +  Ay)  =  j 
otherwise. 

(2) 


Computation  of  the  the  Cluster  Shade  Measure :  This 
measure,  denoted  by  S(Ax,  Ay),  is  defined  as 

L- 1 

S(Ax,Ay)  =  (i+  j  -  m  -  m?P(i,  j|Ax,  Ay)  (3) 

•,i=o 

where 

L- 1 

(«.;)-P(*.;|Ax,  Ay).  (4) 

*,;=o 


Computations  are  performed  in  local,  overlap¬ 
ping  neighborhoods,  and  the  center  pixel  of  the 


neighborhood  is  assigned  the  computed  value  of 
S(Ax,  Ay).  In  other  physical  applications,  image 
analysis  is  performed  with  several  (Ax,  Ay)  combi¬ 
nations.  However,  Holyer  and  Peckinpaugh  (1989) 
found  that  the  edge  detection  performance  of  the  al¬ 
gorithm  is  invariant  to  the  choice  (Ax,  Ay).  Thus, 
for  the  sake  of  simplicity  and  computational  effi¬ 
ciency,  they  chose  Ax  =  Ay  s  0.  With  this  choice, 
P(i,  j\Ax,  Ay)  =  0  for  i  ^  j,  and 

L-l 

P(i,  i)  m  P(i),  and  m  =  n  ■  ^PiP(i).  (5) 

»=o 

This  gives 

L-l 

SC  Ax,  Ay)  =  2  £(*  -  rfPO).  (6) 

1=0 

Determination  of  Significant  Zero  Crossings:  Because 
the  cluster  shade  measure  is  a  third  moment  from 
the  mean,  it  changes  its  sign  from  positive  to  neg¬ 
ative  as  it  goes  from  a  positively-skewed  neighbor¬ 
hood  to  a  negatively-skewed  one.  The  values  are 
largest  in  the  vicinity  of  the  GSNW.  The  values  are 
positive  to  one  side  and  negative  to  the  other  side  of 
the  wall,  and  the  transition  point  from  the  large  pos¬ 
itive  to  large  negative  value  of  the  measure  indicates 
the  GSNW. 

Tb  determine  significant  zero  crossings,  overlap¬ 
ping  33  pixel  neighborhoods  are  examined  in  terms 
of  their  cluster  shade  measures.  A  ‘0’  is  assigned  to 
the  center  pixel  if  the  absolute  value  of  its  cluster 
shade  measure  is  less  than  a  pre-defined  threshold 
value.  Otherwise,  the  neighboring  eight  pixel  val¬ 
ues  are  examined.  If  the  absolute  value  of  any  of 
them  is  larger  than  the  threshold  value  and  is  op¬ 
posite  in  sign  from  the  center  pixel,  a  ‘1’  is  assigned 
to  indicate  an  edge.  Thus,  the  entire  image  is  con¬ 
verted  into  Os  and  Is,  the  edges  being  indicated  by 
a  two-pixel  wide  line;  it  is  two-pixel  wide  because 
the  algorithm  detects  both  positive-to-negative  and 
negative-to-positive  transitions  of  the  cluster  shade 
measure  values. 


4 


This  algorithm  indicates  the  need  to  account  for 
cloud  cover  in  analyzing  the  AVHRR  images.  It  is 
suggested  that  masking  procedures  could  be  used  for 
this  purpose,  but  no  particular  algorithm  has  been 
outlined. 

4J2  URI  Edge  Detection  Efforts 

Evaluation  of  five  edge  detection  algorithms: 
Comillon  and  Watts  (1987)  evaluated  five  different 
methods  of  detecting  the  GSNW.  GS  maps  from 
155  AVHRR  images  were  analyzed  to  locate  the 
northern  edge  of  the  GS  off  Cape  Hatteras,  North 
Carolina.  One  method  was  the  subjective  location 
of  the  northern  edge  by  the  analyst;  the  other 
four  involved  the  objective  location  of  the  edge  by 
computer  using  the  various  statistics  of  the  SST 
field.  Specifically,  the  quantities  considered  were: 
maximum  SST  gradient  (calculated  over  a  3  x  3 
pixel  box),  maximum  SST  (pixel-by-pixel  basis), 
maximum  variance  (calculated  over  a  7  x  7  pixel 
box),  and  a  change  in  the  skewness  of  the  SST 
distribution  (calculated  over  a  5  x  5  pixel  box).  The 
resulting  locations  were  compared  with  the  location 
of  the  15  deg  isotherm  at  200  m  (7is)  determined 
from  the  inverted  echo  sounders  (IESs)  moored  on 
the  sea  floor.  The  best  method  that  yielded  the 
smallest  rms  difference  from  the  IES-derived  T\  5, 
was  the  subjective  one;  the  surface  front  was  located 
9.0  km  shoreward  of  T\5  with  a  rms  difference  of  14.3 
km.  The  best  objective  technique  used  skewness  of 
the  SST  distribution:  Each  pixel  in  the  image  was 
replaced  by  the  skew  of  the  twenty-five  SST  values 
obtained  from  a  5  x  5  pixel  square  centered  on  the 
pixel.  The  skew  changes  sign  when  a  step  in  the  SST 
data,  such  as  the  GS  northern  edge,  is  crossed.  The 
GSNW  located  from  the  skew  images  was  found  to 
be  14  km  shoreward  of  7i5  in  the  mean  with  a  rms 
difference  of  18.2  km.  In  general,  the  more  spatial 
information  used,  the  better  was  the  estimate. 

As  a  part  of  their  AVHRR  data  analysis  at  URI, 
Dr.  Cornillon  and  associates  have  done  a  consider¬ 
able  amount  of  work  on  developing  methodologies  to 
objectively  identify  features  from  the  SST  data,  with 


their  applications  mainly  geared  to  the  estimation  of 
the  GS  axis.  The  present  URI  algorithm  is  the  one 
developed  by  Cayula  and  Comillon  (1992),  hereafter 
referred  to  as  CC92. 

URI  Edge  Detection  Algorithm:  CC92  developed 
a  sophisticated  algorithm  to  objectively  define  ocean 
fronts  using  AVHRR  data.  This  algorithm  is  de¬ 
scribed  in  some  detail  in  the  following,  as  its  objec¬ 
tive  methodology  combined  the  physical  properties 
of  the  SST  and  clouds,  and  complemented  them  with 
statistical  methodology.  Although  edge  detection  is 
the  main  focus  of  the  paper,  the  problem  of  cloud 
detection  is  also  addressed  since  unidentified  clouds 
can  lead  to  erroneous  edge  detection. 

Cloud  Elimination :  The  removal  of  cloudy  re¬ 
gions  is  one  of  the  m^jor  problems  in  determin¬ 
ing  fronts  from  SST.  The  algorithm  performs  a  two- 
dimensional  3x3  median  filtering  on  the  original 
picture  and  removes  the  most  obvious  clouds  iden¬ 
tified  in  the  following  four  steps.  The  first  two 
steps  use  thresholds  on  temperature  and  temper¬ 
ature  gradients  determined  from  the  fact  that  the 
clouds  are  colder  than  the  ocean  and  that  they  are 
characterized  by  high  gradient  magnitudes.  The 
cloudy  regions,  so  determined,  are  put  through  two 
more  steps  to  ensure  that  the  regions  are  really 
cloudy.  Unlike  a  real  edge,  gradient  vectors  in¬ 
side  the  cloudy  region  are  not  coherent,  yielding 
a  smaller  magnitude  of  the  gradient  sum  as  com¬ 
pared  to  the  sum  of  the  gradient  magnitudes,  i.e., 
R  =  IE?=iv^l/E?=iiVT.'l  is  small,  where  VT,  is 
the  temperature  gradient  at  the  ith  pixel.  Thus, 
the  third  step  classifies  the  suspected  cloudy  regions 
from  Steps  1  and  2  as  cloudy  if  R  <  .3  and  clear  if 
R  >  .7.  For  the  regions  in  the  range  3  <  R  <  .7, 
the  cloudy  regions  are  bulky,  whereas  the  edges  are 
elongated  profiles.  Thus,  a  fourth  step  classifies  the 
suspected  region  as  cloudy  if  the  aspect  ratio  (larger 
eigen  value  of  the  spatial  covariance  matrix  divided 
by  the  smaller  eigen  value)  is  greater  than  6. 

Histogram  Analysis  for  Frontal  Specification:  For 
window  level  analyses,  the  CC92  algorithm  divides 
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the  total  picture  into  overlapping  windows  of  32  x 
32  pixels  and  performs  the  histogram  analysis  and 
tests  for  a  two-population  hypothesis  as  follows.  Let 
h(T)  be  the  height  of  the  histogram  in  the  interval 
denoted  by  T.  Further,  divide  the  data  into  two 
parts,  T  >  r  and  T  <r,  and  define 


Ur)  = 


N  i 

Ni  +  N2 


S\(t)  + 


n2 

nx  +  n2 


S2(t) 


(7) 


where 

Stir)  =  ]T[T  -  tatryphCn/Nu 

n. 

with  {fix  :  T  <  r)  and  {fi2  •'  T  >  r};  Ni  = 
r.-r.o,  h(T)  and  Hi(r)  —  y.Tcr^  Th{T)/ Nj.  Also, 
define 


J«r)mm7v^i{T)-,t*ryp-  (8) 

Then  the  total  variance  of  the  window  data  can  be 
partitioned  as: 


Stot  =  Ur)+Ur) 


where  X  is  a  population.  The  idea  is  that  if  the 
correlation  between  neighbors  is  close  to  unity,  or 
the  variance  is  small,  as  would  be  the  case  for  water, 
then  yx  will  be  small.  Thus,  a  threshold,  r,  can  be 
defined  so  that  yx  >  r  will  classify  the  region  as 
clouds.  Based  on  their  experiments,  CC92  chose  two 
thresholds,  one  for  cold  (r  =  4)  and  the  other  for  the 
warm  (T  =  8). 


It  is  quite  possible  for  the  histogram  analysis 
to  give  a  bimodal  histogram  suggesting  two  distinct 
temperature  subgroups  without  having  an  edge.  For 
there  to  be  an  edge  separating  the  two  subgroups, 
each  of  the  two  subgroups  has  to  form  a  connected 
subset;  i.e.,  for  a  given  pixel  that  is  not  close  to 
an  edge,  the  neighboring  pixels  should  belong  to 
the  same  population.  Thus,  after  eliminating  the 
cloudy  region  temperatures,  CC92  apply  a  Cohesion 
Algorithm  that  is  based  on  three  ratios: 


Ci 


Ri_ 

Ti' 


1,2, 


R\  +  R2 
=  Ty+T2  ' 


where  Je(r)  is  the  within-subgroups  variance  and 
Jt(r)  is  the  between-subgroups  variance.  Let  r  =  ropt 
maximize  6{r)  =  J4(r)/7tot(r).  Assuming  Normal 
distribution, 

9(ropt)  =  2/x  ~  0.63. 

Based  on  the  result: 

Pr{0(ropt)  <  0.7}  ~  0.99 

by  Duda  and  Hart  (1973),  CC92  chose  0(ropt)  =  0.7 
to  provide  the  threshold  to  check  whether  there  are 
two  different  temperature  populations  in  a  given 
window. 

Having  decided  that  there  are  two  temperature 
populations  in  a  window,  they  again  check  whether 
one  of  them  could  be  fine  clouds.  This  is  done  in 
two  steps:  (1)  examine  the  variability  within  each 
subgroup,  as  the  cloudy  regions  have  comparatively 
a  larger  variability;  (2)  perform  a  correlation  test  by 
computing 

yx  =  £(1*  -  y\  -  I E(x  -  y)|),  V(x,  y)  E  X  (9) 


with  Ti  and  Ri  are  defined  as 

Ti  =  |{(x,y) :  y  €  [Y(*)n  *]Vx  6  n,}|, 

Ri  =  |{(*,y) :  y  6  [Af(*)nn,-]Vx  g  n.}|, 

where  |.|  defines  the  cardinality  of  the  set,  and  Af(x) 
defines  the  neighborhood  of  the  pixel,  z,  as: 

N )  =  J  +  l  i  Xi,j  —  1 !  xi  —  l,j  >  *1  +  1  ,j  }  • 

In  fact,  to  save  computations,  they  used  a  modified 
neighborhood  set N'{xij )  =  {z,+ii;-,z,J+i}.  Athresh- 
old  of  0.92  for  C  and  0.90  for  Q  was  chosen  so  that  an 
edge  hypothesis  within  a  window  is  rejected  if  either 
C  <  0.92  or  Q  <  0.90.  Derivation  of  these  thresh¬ 
olds  is  based  on  probability  of  error  formulation. 

Having  confirmed  that  a  window  contains  an 
edge,  the  last  step  in  the  window  level  processing 
is  to  specify  the  pixels  that  correspond  to  the  edge. 
First,  define  an  indicator  function  that  assigns  to 
each  pixel  within  the  window  a  digital  count,  7(x)  =  0 
if  x  6  Oi,  and  /(x)  =  1  if  x  G  n2.  A  pixel  x  is  defined 
as  an  edge  pixel  if  I(x)  4  Ky),  y  G  N'ix). 
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5*  FEATURE  I  .AttPlT  JJfG 

After  the  feature  edges  have  been  determined, 
the  next  step  in  the  automatic  feature  construction 
process  is  to  label  these  edges  according  to  the  fea¬ 
tures  present.  For  instance,  in  the  GS  system  edges 
can  correspond  to  several  types  of  features,  e.g., 
warm  eddy,  cold  eddy,  North  Wall,  and  South  Wall. 
Unless  the  labeling  is  performed  with  sufficient  ac¬ 
curacy,  the  subsequent  feature  description  will  be 
faulty. 


5.1  Relaxation  Labeling  Algorithm 


In  the  Navy’s  algorithm,  the  labeling  is  per¬ 
formed  by  a  nonlinear  probabilistic  relaxation  pro¬ 
cess  (Krishnakumar  et  al.,  1990a),  which  requires 
an  initial  guess  of  the  probabilities  of  edge  segments 
belonging  to  each  feature.  This  first  guess  is  usu¬ 
ally  available  from  a  previous  analysis  and  moving 
it  forward  in  time  (Holyer  and  Peckinpaugh,  1990). 
The  relaxation  algorithm  is  composed  of  two  steps. 
The  first  step  computes  a  priori  probabilities  from 
a  ground  truth  data  or  a  recent  mesoscale  analysis. 
The  second  step  iteratively  updates  these  a  priori 
probabilities  until  a  consistent  labeling  is  achieved. 
A  mathematical  formulation  by  Krishnakumar  et  al. 
(1990a)  is  as  follows. 


Mathematical  Framework:  Let  A  =  { Ai , . . . ,  Am} 
be  the  set  of  possible  feature  labels  that  can  be  as¬ 
signed  to  pixels  x  within  a  scene.  Correspondingly, 
let  p$(x,t)  be  the  probability  that  the  pixel  x  is  as¬ 
signed  labels  A  after  k  iterations  of  the  relaxation 
algorithm.  The  probabilities  are  allowed  temporal 
dependence  so  as  to  take  advantage  of  the  temporal 
continuity  of  feature  evolution. 


Step  1:  Estimation  of  a  priori  probabilities:  Let 
p$(x,t)  denote  the  initial  probabilities  assigned  to 
the  pixel  x(i,  j)  at  time  t.  Then,  using  the  Bayes 
theorem, 


„0<r  ,x_  p(a,t|A)P(A) 

Px  ’  ;~£ap(MA)P(A) 

where  P{  A)  probability  of  occurrence  of  the  feature  A 
and  p(x,  <|A)  is  the  conditional  (multivariate)  proba¬ 
bility  density  function  of  a  vector  X  derived  from  the 


physical  properties  of  the  pixel  x(i,j),  viz,  distance 
and  the  direction  of  the  pixel  x(i,j)  from  the  origin, 
gray  scale  intensity,  and  the  edge  magnitude  derived 
using  the  edge  detector  algorithm.  Thus,  p(z,<|A)  is 
specified  as 

*.l|»  -  (fcpup-t  exp  { =SL^gL-*s>} , 

where  n\  and  SA  are  the  mean  and  covariance 
matrix  of  the  object  A.  Probabilities,  P( A)  are 
computed  as  relative  frequencies 


P(  A)* 


n\ 


£a  "a 

where  rix  are  the  number  of  pixels  in  the  object  A. 


Step  2:  Iterative  Updating:  Knowing  the  probabili¬ 
ties,  p%(x,t),  at  iteration  k,  they  are  updated  as 

n*+i/_  p£(MXl  +  qkx(x)) 

P\  V®,  t)  —  Qr,x  — ,  , 

£a  Pa  (*><)?*(*)) 

+  (1-  ax)p\(x,t') 


where  q$(x)  are  called  the  updating  factors,  ax  are 
temporal  weighting  functions,  and  px(x,t‘)  is  the 
probability  at  a  prior  time  H  <  t.  In  some  cases, 
the  temporal  dependence  can  be  removed  by  setting 
ax  =  1.  The  updating  factors  are  computed  at  each 
iteration  as 


qj(z)  =  (1/m)  rAv(x,  y)px-(x) 

y  A- 

where  rAA'(x,  y)  are  known  as  the  compatibility  co¬ 
efficients  and  behave  like  a  correlation  coefficient. 
For  example,  - 1  <  rAA.(x,  y)  <  1;  it  is  positive  if  A  on 
x  co-occurs  with  A'  on  y,  negative  if  they  frequently 
do  not  co-occur,  and  equality  with  ±1  is  achieved 
if  the  co-occurrence/nonco-occurrence  is  perpetual. 
The  iterative  process  is  terminated  when  the  differ¬ 
ence  between  two  successive  probabilities  is  negligi¬ 
ble.  For  greater  detail,  see  Kittler  and  Illingworth 
(1985)  where  the  method  of  their  computation  is  ex¬ 
plained. 


The  choice  of  the  temporal  weighting  function 
is  arbitrary  and  considered  to  be  time  varying.  In 
certain  situations,  the  use  of  this  formulation  may 
be  problematic;  a  clear  direction  towards  its  choice 
needs  to  be  formulated.  In  description  of  the  Navy’s 
expert  system,  ax  -  1  (Krishnakumar  et  al.,  1990b, 
1991). 
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5.2  Contour-Following  Algorithm 

The  URI  system  does  not  have  a  contour  labeling 
algorithm  per  se.  However,  as  a  follow-on  to  the 
edge  detection  step,  the  system  connects  the  pixels 
identified  as  an  edge  to  form  a  continuous  curve 
to  give  it  a  frontal  shape.  Thus,  this  algorithm 
identifies  p  different  contours  without  assigning  any 
label.  This  is  as  opposed  to  the  Relaxation  Labeling 
Algorithm  where  each  pixel,  identified  as  an  edge 
pixel,  is  assigned  a  feature  label. 

The  contour  following  algorithm  assigns  to  the 
nth  pixel  of  the  pth  contour  the  paired  value  of  (n,  p). 
Among  the  edge  pixels  detected  at  the  window  level 
that  are  neighbors  of  (n,p),  the  algorithm  selects 
that  pixel  as  (n  + 1,  p)  which  affects  the  least  change 
in  the  direction  of  the  contour.  However,  no  pixel 
is  added  if  the  direction  has  to  change  more  than 
90  deg  in  5  pixels.  When  no  previously-detected 
edge  pixel  can  be  added  to  the  contour,  the  algorithm 
examines  the  ratio  R  =  E"»lV7j|/£"_1|V7<|  in  a 
3x3  pixel  window  centered  on  the  last  contour 
pixel,  where  V7<  is  the  temperature  gradient  at  the 
ith  pixel.  If  R  >  .7,  then  the  algorithm  adds  to 
the  contour  that  one  pixel  from  the  3x3  window 
for  which  the  scalar  product  of  the  gradient  vector 
at  the  pixel  point  with  the  gradient  vector  at  the 
last  contour  point  is  the  maximum.  Because  the 
algorithm  relies  only  on  the  first  neighbors  of  the  last 
contour  edge  pixel,  it  is  capable  of  resolving  more 
than  one  front  in  a  window.  Finally,  the  algorithm 
eliminates  isolated  edge  pixels  by  deleting  contours 
with  fewer  than  15  pixels. 

6.  SPATIAL  INTERPOLATION 

ALGORITHMS:  DETERMINATION  OF 

THE  GULF  STREAM  LOCATION 

Because  of  cloud  cover,  the  labeled  edges  pro¬ 
vide  quite  sparse  information  on  the  GS  front,  and 
it  must  be  filled  via  interpolation.  There  are  three 
different  approaches  at  this  point.  The  first  one  is 
purely  statistical,  which  is  based  on  empirical  infor¬ 
mation  on  the  mean  and  the  covariance  structure 


of  the  front.  The  mean  and  covariance  structure 
of  the  feature,  derived  from  climatological  records, 
are  combined  with  the  partial  observations  on  the 
feature  for  an  overall  estimation  of  the  feature. 
The  mean  provides  an  anchor,  while  the  covariance 
structure  constrains  the  estimated  solution  to  con¬ 
form  with  the  observations.  The  success  of  this  ap¬ 
proach  depends  on  the  accuracy  of  the  mean  front, 
the  variance  of  the  front  at  different  locations,  and 
the  adequacy  of  the  covariance  parameterization. 
The  next  approach,  based  on  space-time  interpola¬ 
tion,  overcomes  these  drawbacks  and  incorporates 
the  past  information.  The  last  approach  is  that  of  dy¬ 
namical  interpolation  where  the  sparse  information, 
that  may  or  may  not  pertain  to  the  front,  is  assim¬ 
ilated  into  a  dynamical  model  to  provide  a  synoptic 
update. 

For  the  statistical  approach,  four  algorithms  are 
reviewed.  Not  surprisingly,  these  are  based  on  the 
well-known  techniques  of  Optimum  Interpolation, 
Empirical  Orthogonal  Functions,  Contour  Analysis 
and  Kalman  Filtering,  which  are  often  used  in  mete¬ 
orology  and  oceanography.  The  first  method,  known 
as  the  Pathfinder  algorithm  by  Horton  (1989),  is  ap¬ 
plied  for  the  objective  determination  of  the  surface 
GSNW.  The  algorithm  is  based  on  the  optimum  in¬ 
terpolation  technique  of  Bretherton  et  al.  (1976) 
to  interpolate  observations  on  the  GS  location  us¬ 
ing  the  GS  climatology  as  the  first  guess.  The  sec¬ 
ond  method  is  based  on  the  well-known  empirical 
orthogonal  functions  (EOF)  and  applied  also  for  the 
location  of  the  surface  GS  axis.  This  technique,  de¬ 
scribed  by  Molinelli  and  Flanigan  (1987),  was  origi¬ 
nally  proposed  by  Carter  ( 1985).  The  premise  is  that 
the  first  few  CEOF  modes  incorporate  the  correla¬ 
tion  of  the  relative  location  of  the  fixes  along  the  GS 
axis.  These  two  algorithms,  however,  focus  on  the 
current  epoch  of  estimation  and  do  not  incorporate 
information  from  the  past  (or  the  future)  estimation. 
The  contour-melding  procedure  by  Mariano  (1990) 
utilizes  the  phase  speeds  computed  from  the  past 
and  future  estimations  to  perform  an  overall  inter¬ 
polation  of  the  GSNW  position.  Chin  and  Mariano 
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(1993)  take  it  one  step  further  and  cast  the  estima¬ 
tion  problem  in  terms  of  a  Kalman  filter  type  formu¬ 
lation. 

This  section  is  organized  as  follows.  We  start 
with  two  purely  statistical  algorithms  that  combine 
partial  information  on  a  feature  with  empirically- 
derived  means  and  spatial/temporal  correlations. 
This  is  followed  by  work  using  a  semi-statistical  pro¬ 
cedure.  Dynamic  interpolation,  or  data  assimila¬ 
tion,  is  perhaps  the  best  known  technique  in  this 
category.  It  combines  output  of  a  numerical  ocean 
model  with  observations  that  may  not  necessarily  be 
on  the  feature.  The  feature  is  then  extracted,  us¬ 
ing  some  specified  formulation  of  the  feature  defini¬ 
tion.  OFI  work  using  AVHRR  data,  done  by  Comil- 
Ion  and  colleagues,  pertains  to  a  different  type  of 
semi-statistical  approach  in  that  it  exploits  the  phys¬ 
ical  characteristics  of  the  variable  underlying  the 
feature,  and  uses  statistics  to  establish  thresholds  to 
test  the  sequential  hypotheses  to  extract  features. 

6.1  The  Pathfinder  Algorithm 

The  Pathfinder  algorithm  is  based  on  the  mean 
and  covariance  concept.  It  converts  observed  (lon¬ 
gitude,  latitude)  positions  into  (downstream,  cross¬ 
stream)  anomalies  relative  to  a  mean  path.  Then  the 
approach  assumes  that  the  GS  meanders,  in  terms  of 
cross-stream  anomalies  along  the  mean  path,  form  a 
second-order  stationaiy  process,  i.e.,  the  covariance 
between  two  cross-stream  anomalies  is  a  function  of 
the  downstream  distance  between  them.  The  imple¬ 
mentation  of  the  algorithm  is  described  in  terms  of 
the  objective  analysis  technique  of  Bretherton  et  al. 
(1976).  This  01  technique  is  used  to  estimate  the 
GS  path  from  a  limited  number  of  observations  on 
the  GS  positions  with  varying  ages. 

The  brief  description  given  below  provides  elab¬ 
oration  of  the  statistical  details  of  the  method. 

Let  (xj ,  yj ),  j  -  1,  • .  • ,  n  be  given  observations  on 
the  GS  axis,  where  Zj  is  the  downstream  anomaly 
and  yj  is  the  measurement  on  jth  cross-stream 
anomaly  The  problem  is  to  estimate  rj  =  rj(x)  for 


a  specified  downstream  anomaly  x.  Then  j$  can  be 
expressed  using  a  simple  statistical  model: 

yj-Vj+tj,  i  =  1, ....  n  (10) 

where  statistical  expectation  of  jj  is  zero,  i.e.,  Eiy, )  = 
0.  The  observed  GS  positions  are  contaminated  by 
random  measurement  errors,  e;  ,  that  are  uncorre¬ 
lated  with  one  another,  i.e.,  £(e;e*)  =  with 

Sjt  being  the  Kronecker’s  Delta  function  and  of  the 
measurement  error  variance.  The  measurement  er¬ 
rors  are  uncorrelated  with  GS  position  anomalies, 
i.e.,  £(cjT)k)  =  0.  The  statistical  model  is  completed 
with  a  specification  of  the  covariance  between  vari¬ 
ous  observation  pairs  in  terms  of  the  covariance  ma¬ 
trix  A  =  (Aij )  where 

Aij  =  £(y<y, )  =  SOnm )  + 

=  Cij  +  . 

The  covariance,  Qj,  is  a  function  of  the  GS  spreads 
(variances)  and  oj  at  downstream  distances  z,  and 
Xj  and  the  correlation  function,  cfe,  z;  ).  Thus, 

Cij  —  (TifTjdXi,  Xj). 

Let  the  covariance  of  tj  -  dx)  with  each  rjj  be 
given  by 

CXj  =  £(rj(x)T]j). 

With  this  setup,  we  can  estimate  tj(x)  as  a  linear 
function  of  the  observations: 

n 

r)(x)  =  Y^ajyj’  (11) 

1=1 

where  the  coefficients  a j  are  derived  from  a  least 
squares  formulation  as  follows.  Let  Y  =  (ja  , . . . ,  y„  X, 
C(z)  =  C  =  (Cxi,  ■  ■  -,Cxny,  where  '  represents 
matrix  transposition.  Then,  according  to  the  least 
squares  formulation,  we  minimize 

£(rj  -  fj )2  =  £(tj(x)  -  a'Y)2 

=  £[rj(x)2  -  2a'Y77(z)  +  a'YY'a] 

=  cXx  -  2a'C  +  a'Aa 
=  [a  —  A-1C]'A[a  —  A-1C] 

+  CIX  —  C'A-1C.  (12) 
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Note  that  the  last  two  terms  in  eq.  (12)  are 
independent  of  a,  and  the  first  term  involving  a  is 
a  positive  definite  quadratic  form  in  the  matrix  A. 
Thus,  the  minimum  is  obtained  by  setting 

a  *  A-1C, 

and  the  estimation  error  is  given  by 
£(r£x)  -  a'Y?  =  Cxx  -  C'A~lC 

n 

=  cxx  -  £  CXrCx,Ar> •  (13) 

r,*=l 

where  Ar>  is  the  rsth  element  of  A-1.  This  deriva¬ 
tion  is  similar  to  that  in  Bretherton  et  al.  (1976),  but 
simpler  due  to  the  use  of  the  matrix  notation.  Thus, 
y(x)  for  a  specified  x  is  estimated  as: 

y(x)  =  o' Y 

=  C'A-1  Y 

(14) 

i= i  \t=i  / 

In  both  Horton  (1989)  and  Bretherton  et  al.  (1976), 
eq.  (14)  has  been  written  as: 


This  unconventional  notation  is  liable  to  cause  con¬ 
fusion,  since  Aj£  is  interpreted  as  1/A,*  and  not  as 
A>*. 

Horton  (1989)  makes  the  basic  assumption  of 
stationarity  of  covariance  in  terms  of  the  down¬ 
stream  coordinate  x.  For  two  points  at  downstream 
distances  x  and  r,  he  formulated  his  equation  (14) 
in  terms  of  a  prescribed  correlation  function,  = 
d\x  -  r|),  given  by: 

cxr  *  cos(2t|i  -  r|/u>i)exp(-|x  -  r|l  5)  (15) 

where  wt  =  450  km  is  the  wavelength  and  A  is 
the  decay  variable,  =  .00032001  km-3/2.  Thus, 
Cxr  =  <7x<rrc(\x  -  xr\),  where  <t\  is  the  variance  of  the 
meander  at  the  downstream  coordinate  x. 


For  further  details  on  the  operational  implemen¬ 
tation  of  the  model,  see  Horton  ( 1989),  where  he  sug¬ 
gests  ways  of  using  old  observations.  GS  front  lo¬ 
cation  estimation  errors  are  due  to  several  sources; 
some  of  them  are:  observation  error  variance,  vari¬ 
ance  of  the  meander  envelope  around  the  mean  front 
location,  the  assumption  of  stationary  covariance 
and  the  choice  of  the  corresponding  correlation  func¬ 
tion.  Based  on  his  simulation  experience,  Horton 
(1989)  concludes  that: 

•  The  mean  path  errors  are  not  well  understood. 
The  variance  of  the  mean  path  error  might 
be  typically  about  25  percent  of  the  meander 
envelope. 

•  The  effect  of  the  mean  path  errors  can  be  reduced 
by  modifying  the  correlation  function  used  in  the 
optimum  interpolation,  which  lets  the  observa¬ 
tions  modify  the  mean  path. 

•  Using  the  optimum  inteipolation  method,  the 
errors  in  the  interpolated  paths  were  below  50 
km,  even  with  observational  gaps  of  up  to  several 
degrees  longitude.  However,  he  conjectured  that 
the  inadequate  sampling,  coupled  with  atypical 
meanders  such  as  ring-formation  events,  might 
produce  errors  two  to  three  times  as  great. 

6.2  Complex  Empirical  Orthogonal  Function 
Algorithm  for  the  Gulf  Stream  Path 

The  empirical  orthogonal  function  (EOF)  ap¬ 
proach  is  another  statistical  approach  commonly 
used  in  the  literature  to  capture  the  overall  correla¬ 
tion  of  a  space-time  feature  in  a  few  vectors.  These 
vectors  are  the  eigen  vectors  of  the  empirically- 
estimated  covariance  matrix  of  the  feature  and  are 
referred  to  as  the  EOFs  of  the  feature.  To  construct  a 
specific  synoptic  realization  of  the  feature,  one  sim¬ 
ply  computes  a  weighted  sum  of  these  EOFs,  where 
the  weights  sire  given  by  the  EOF  ‘amplitudes’  cor¬ 
responding  to  the  feature  realization.  Thus,  the 
EOF  approach  involves  two  steps.  One  is  to  derive 
the  EOFs  from  the  empirically-estimated  covariance 
matrix  of  the  feature.  This  step  is  performed  once  for 
all  subsequent  construction  (estimation)  of  the  fea¬ 
tures.  The  second  step  requires  converting  partial 
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Figure  1.  CEOF  optimization  geometry;  P,  is  an  observation  and  P„eore»<  is  its  projection  on  lineraly 
segmented  GS  path  constructed  at  an  earlier  step. 


observations  on  a  feature  realization  into  EOF  am¬ 
plitudes  so  that  the  weighted  sum  of  the  EOFs  can  be 
computed  to  derive  the  entire  realization.  This  step 
is  repeated  each  time  a  feature  realization  has  to  be 
constructed.  Carter  (1985)  proposed  the  concept  us¬ 
ing  complex  EOFs  (CEOFs)  to  describe  the  GS  path. 
Molinelli  and  Flanigan  (1987),  hereafter  referred  to 
as  MF87,  followed  it  up  by  developing  an  algorithm 
that  successively  updates  the  CEOF  analyses  with 
a  partial  set  of  observations  to  yield  a  new  fix  of  the 
GS  axis.  In  this  formulation,  a  location  (ar,  y)=  (longi¬ 
tude,  latitude)  is  represented  as  a  complex  number 
tv  *  x  +  iy.  Thus,  w,  =  Xj  +  iy,  are  the  locations  along 
the  Gulf  Stream  axis  at  equal  spatial  distance  inter¬ 
vals,  so  that  the  vector  w  =  (u^ , . . .  w^y, '  indicating 
matrix  transposition,  defines  a  discrete  representa¬ 
tion  of  the  Gulf  Stream.  Let  E  =  (tvitvj),  where  *  in¬ 
dicates  complex  conjugate  and  ()  denotes  ensemble 
average.  Let  e; ,  j  =  1, ....  n  be  the  eigen  vectors  of 


E  with  \j  as  the  corresponding  eigen  values.  Since 
E  is  Hermitian,  the  eigen  values  A,  are  real;  how¬ 
ever,  the  eigen  vectors  are  complex.  With  this,  we 
can  define  rotated  variables  (complex  amplitudes) 

Cfc  =  ejw  (16) 

where  ej  is  the  complex  conjugate  transpose  of  e*. 
The  original  discrete  GS  representation,  w,  can  be 
recovered  from  c*  as  the  weighted  sum  of  the  EOFs: 

N 

w  =  =  Ec,  (17) 

k=l 

where  E  =  =  (ei ,  ...e/v)  and  c  =  c;v  = 

(ci, .  ..,cNy. 

At  this  time,  we  reiterate  some  points  pertaining 
to  the  GS  estimation  using  the  CEOFs: 

•  The  EOFs,  represented  by  the  eigen  vectors  e, 
are  obtained  from  the  statistical  population  of  the 
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Gulf  Stream  axis,  and  thus,  entail  the  overall 
correlation  structure  between  various  locations 
of  the  Gulf  Stream  axis; 

•  However,  the  amplitudes  ct  represent  character¬ 
istics  of  an  individual  GS  realization  and  vary 
from  one  GS  realization  to  another. 

»  In  general,  the  overall  variability  of  the  system 
can  be  explained  by  a  small  number,  say  n,  of 
modes.  Thus,  a  GS  realization  w  in  eq.  (17) 
can  be  approximated  as  the  weighted  sum  of  the 
dominant  EOFs: 

n 

w  ~  ■  E„c„,  (18) 

t= 1 

•  We  observe  from  eq.  (18)  that  E„  acts  as  a  basis 
set’  of  vectors,  and  a  GS  realization  is  obtained 
as  a  linear  combination  of  the  basis  set.  Writing 
ct  s  at  exp(t>t)>  we  note  that  weighting  with  c* 
results  in  an  amplification  as  well  as  a  rotation 
of  the  EOF  vector,  et. 

MF87  solved  the  problem  of  estimating  a  GS 
realization  from  only  a  partial  information  using  the 
CEOF  formulation.  For  this,  the  partial  information 
is  mapped  onto  complex  amplitudes,  c„  =  (ci , . . .  cn  Y , 
so  that  the  realization  w  could  be  constructed  using 
eq.  (18).  The  MF87  implementation  was  based 
on  empirical  data  sets  derived  from  84  realizations 
of  the  Gulf  Stream  axis,  interpolated  at  vV  =  132 
equidistant  points  with  10  km  spacing  along  the 
GS  axis.  They  could  describe  the  Gulf  Stream 
shapes  with  as  few  as  n  =  10  modes,  accounting 
for  99  percent  of  the  variability.  The  root  mean 
square  difference  between  the  132  estimated  and  the 
empirical  GS  fixes,  with  n  =  10  using  eq.  (18),  was 
~  6  nmi. 

A  nonlinear  least  squares,  gradient-search  algo¬ 
rithm  was  used  to  map  the  partial  observations  on 
the  GS  axis  onto  the  complex  amplitude  vector  q,. 
The  approach  is  to  start  with  a  set  of  q,  i  =  1,  •  •  • ,  n 
based  on  a  well-defined  snapshot  of  GS,  and  then  op¬ 
timally  adjust  the  q  until  the  resulting  w  deviates 


from  the  few  available  GS  fixes  in  a  least  squares 
sense.  Mathematically,  the  formulation  is  as  follows. 

At  any  time,  let  p, , j  =  1,  -,K  be  partial 
information  on  the  GS  axis  in  terms  of  K  position 
fixes.  An  individual  p,  could  be  from  any  source, 
including  IR  and  SSH.  Then  according  to  the  least 
squares  criterion,  we  would  like  to  determine  a  GS, 
w  =  2*=1etc*,  that  minimizes 

K 

x2  =  2|p<  -  Piwl2,  (19) 

•  =1 

where  of  is  the  position  error  variance  for  r,  and 
PiW  is  the  projection  of  the  observed  position  r 
on  the  to-be-determined  discrete  GS  representation 
w;  see  Fig.  1.  The  details  of  the  minimization 
procedure  used  are  standard  (see  Bevington,  1969), 
and  are  thus  omitted. 

The  minimization  of  X2  in  eq.  (19)  is  nonlinear 
in  the  parameters  q,i  =  1,  •  •  • ,  n,  and  thus  requires 
their  initial  guesses.  These  guesses  are  obtained  by 
substituting  in  eq.  (16)  a  pre-existing  discrete  GS  re¬ 
alization  w0.  Since  the  above  minimization  requires 
the  estimation  of  2n  parameters,  corresponding  to 
the  real  and  imaginary  parts  of  q,,  we  must  have 
K  >  n.  The  MF87  study  indicates  that  for  a  GS 
axis  estimation,  the  number  of  fixes  from  SSH  and 
AVHRR  is  typically  in  the  25  to  75  range,  and  thus 
quite  adequate  for  estimation  purposes. 

According  to  the  simulation  results  of  MF87, 
this  procedure  provides  an  accuracy  of  30  nmi,  the 
same  as  obtained  in  manual  analyses  performed  by 
WSC.  However,  because  the  computer  optimization 
is  automated,  it  minimizes  manual  analysis. 


Mariano  (1990)  came  up  with  a  contour  analy¬ 
sis  approach  for  melding  different  analyses  of  geo¬ 
physical  fields,  which  he  adapted  for  interpolating 
the  GSNW  position  from  gappy  information.  He  ar¬ 
gued  that  the  combining  of  two  or  more  different 
fields  using  the  usual  weighted  averaging  or  least 


8.3  CONTOUR  ANALYSIS 
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Figure  2.  Example  of  contour  analysis  interpolation  of  GSNW  (Adapted  from  Mariano,  1990) 


squares  approach  would  tend  to  smear  the  informa-  the  same  fraction  of  the  arc  lengths  along  each  con- 

tion  on  the  features.  To  alleviate  this  problem,  he  tour.  Mariano’s  implementation  of  this  correspon- 

proposed  contouring  the  various  analyses  and  meld-  dence  is  as  follows: 


mg  the  positions  of  the  features  as  specified  by  con¬ 
tours,  rather  than  averaging  the  fields.  This  ap¬ 
proach  has  the  distinct  advantage  of  retaining  the 
shapes  of  the  features  while  appropriately  adjust¬ 
ing  the  amplitude  and  phase  information.  For  a 
formal  but  brief  description  of  the  Contour  Analysis 
approach,  consider  the  melding  of  two  fields  charac¬ 
terized  by  sets,  G,  i  =  1,2,  of  contours  such  that 

Ci  -  {GO')}, 

where  j  represents  the  contour  number,  and  melding 
is  to  be  performed  by  combining  corresponding  con¬ 
tours  in  the  two  fields.  Let  two  such  correspondiiig 
contours  be  specified  by  Ci  and  C2,  where  the  con¬ 
tour  index  j  is  dropped  for  notational  convenience. 
Then  Q  are  represented  by  longitude,  latitude  posi¬ 
tions  as 

Ci  *  {(z«t, ytk ), k  —  1, 2,  •  •  • , f>i}.  (20) 

At  this  stage,  there  is  a  need  to  set  up  a  corre¬ 
spondence  between  Ci  and  Cj  in  terms  of  the  geo¬ 
physical  feature  to  be  extracted.  The  correspon¬ 
dence  is  established  by  selecting  points  located  at 


Compute  arc  length:  For  each  contour,  start  with 
the  original  sets  of  points  Q  :  ( xa,y,t ),  *  = 
1,  •  •  • ,  n,-,  and  define  arc  lengths  as  follows.  Let 
Axik  =  (*it  -  Xi(t-i))  and  Ayik  »  (j -  y,(*-i)>. 
Then,  AS, (it)  =  (Ai-t  +  A y?t)  is  the  arc  length 
between  points  (x,(*-i),  y,(*_i))and  (x,*,ylt).  The 
arc  length  corresponding  to  the  1th  point  (from 
the  first  point)  is  given  by 

1 

Sit  =  Si(k); 

k  —  2 

The  total  arc  length  for  the  ith  contour  is  denoted 
by  Si. 

Set  up  correspondence  between  Ci  and  C2'  This 
correspondence  is  set  up  in  two  steps:  (1)  First, 
for  each  contour,  G,  fit  separate  cubic  splines 
to  Xik  and  as  a  function  of  the  arc  length, 
Sik-  (2)  From  the  splines,  then  interpolate  at  N 
(the  same  number  for  the  two  contours)  of  new 
points,  (Xik,Yik),  k  =  1, . . . ,  N  at  equal  arc  length 
intervals,  AS,  *  Si/N.  The  choice  of  arc  length 


13 


as  the  independent  variable  permits  melding  of 
multivalued  features  with  an  appropriate  phase. 

•  With  this  at  hand,  the  melded  contour  is  calcu¬ 
lated  as; 

2  2  \ 

WikXik ,  ik  1  ,  (21) 

»=i  «=i  / 

for  it  =  1,  •  •  • ,  N,  where  the  weight  u>,*  may  be  chosen 
proportional  to  the  inverse  of  the  error-variance 
assigned  to  each  position,  such  that  u**  +  w2k  =  1. 

Mariano  applied  the  above  contour  melding 
technique  to  incomplete,  time-varying  fields  to  per¬ 
form  space-time  interpolation  of  the  GSNW  for  824 
cases.  The  SST  data  from  the  AVHRR  were  mapped 
to  a  common  coordinate  system  and  composited  into 
two-day  groups  at  the  URI,  retaining  the  warmest 
pixel  out  of  approximately  ten  satellite  passes.  The 
GSNW  position  was  then  hand  digitized. 

Due  to  cloud  cover,  the  resulting  data  sets  from 
the  above  operations  suffered  from  several  spatial 
and  temporal  gaps  that  needed  to  be  filled.  To 
demonstrate,  Fig.  2  adapted  from  Mariano  (1990) 
shows  two  examples  of  filling  gaps  in  the  the  GSNW 
positions  using  contour  analysis  interpolation.  Fig. 
2b  shows  the  hand  digitized  GS  region  on  Day  6336.5 
with  gaps  between  66°  and  58°  W.  The  first  step  to  fill 
these  gaps  was  to  find,  in  the  immediate  past  (Day 
6332.5,  Fig.  2a)  and  the  future  (Day  6338.5,  Fig. 
2c),  GSNW  records  which  provided  coverage  for  the 
gappy  area  under  consideration.  For  a  proper  ap¬ 
plication  of  the  contour  analysis  algorithm,  the  two 
records  were  propagated  to  a  common  analysis  time. 
A  phase  speed  was  computed  by  averaging  the  prop¬ 
agation  8 peed  of  the  local  maximum  and  minimum 
between  the  past  and  future  time  assuming  a  dom¬ 
inant  east-west  propagation  speed.  The  latitudes 
were  propagated  forward  for  the  past  and  backward 
for  the  future.  With  the  time  synchronization  done, 
the  contour  analysis  fitted  cubic  splines  to  latitudes 
and  longitudes  of  the  *past’  and  ‘future’  records  as 
a  function  of  the  arc  lengths.  These  splines  were 


interpolated  to  create  the  new  data  sets  at  equal 
arc  lengths,  which  were  then  averaged  according  to 
eq.  (21)  to  obtain  the  fully-interpolated  data  sets. 
The  weights,  w,*  were  taken  to  be  inversely  propor¬ 
tional  to  the  time  between  the  present  gap  and  the 
time  of  the  past  and  future  data.  Thus  for  the  past, 
wi*  =  1/3,  and  for  the  future,  u* t  =  2/3.  Fig.  2d-f 
show  another  example  of  similar  interpolation. 

Summarizing  the  Contour  Analysis  approach  to 
interpolate  the  gappy  areas  of  the  GS  positions  we 
note: 

•  The  GSNW  is  a  multivalued  phenomenon  which 
has  been  difficult  to  resolve.  Mariano’s  approach 
is  quite  ingenious  in  fitting  splines  to  latitude 
and  longitudes  as  functions  of  an  independent 
variable,  arc  length. 

•  The  approach  combines  the  phase  and  ampli¬ 
tudes  of  two  sets  so  as  to  preserve  an  appropriate 
shape. 

•  Mariano  (1988)  has  applied  this  approach  to 
create  an  atlas  of 824  GSNW  positions  with  quite 
satisfactory  results. 

»  Algorithm  performance  deteriorates  for  those 
cases  involving  ring-births  or  strong  ring-stream 
interactions. 

•  Algorithm  performance  can  be  improved  by  em¬ 
ploying  two-dimensional  phase  speeds  and  also 
by  using  better  ways  to  calculate  these  phase 
speeds. 

•  In  simulation  experiments  performed  by  Mari¬ 
ano  (1990),  he  found  the  interpolation  error  was 
of  the  order  of  10  km. 

»  The  largest  interpolation  errors  (50-80  km)  were 
found  in  the  meander  amplitudes  in  the  vicinity 
of  58°  W  in  each  curve. 

6.4  Optimal  Space-Time  Interpolation 

Mariano’s  approach  of  using  the  arc  length  as 
the  independent  variable  to  interpolate  the  GS 
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Figure  3.  Examples  of  complex  GS  meanders  in  ‘S’  and  ‘O'  shapes.  Estimation  errors  are  indicated 
by  the  size  of  *+’  signs  (Adapted  from  Chin  and  Mariano,  1993) 


path  was  a  major  step  in  being  able  to  resolve 
complex  and  multivalued  meanders  of  ‘S’  and  ‘fl’ 
shapes.  It  also  introduced  the  use  of  past  and 
future  observations  in  computing  the  GS  path  at 


the  present  time.  However,  the  approach  was 
heuristic.  Chin  and  Mariano  (1993),  referred  to 
as  CM93  in  the  following,  enhanced  the  Mariano 
approach  by  formalizing  the  concept  via  a  Kalman 
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type  ‘predictor-corrector’  approach,  while  retaining 
the  arc  length  concept.  Let  pis,  t)  =  [*(«,  t),  yis,  t)J 
be  the  true  GS  contour  locations  at  time  t  and  arc 
length  8  relative  to  some  reference  point.  The  arc 
length  8  of  p  is  discretized  so  that  the  final  GS  is 
represented  by  a  vector  pOfc)  whose  ith  element  p 
corresponds  to  arc  length  $  =  «A a  at  time  ifc  At. 
Since  p  is  yet  to  be  estimated,  there  is  no  knowledge 
about  its  arc  length.  However,  there  is  no  confusion 
as  far  as  notation  is  concerned.  Let  p**,  i  =  1,  •  •  • ,  m* 
be  the  observation  points  digitized  from  the  fcth  SST 
image.  It  is  assumed  that  the  discretization  is  fine 
enough  such  that,  up  to  some  quantization  error,  the 
arc  length,  o’,-,  associated  with  the  observed  data  p,* 
corresponds  as  <r,  =  jAs  for  some  j.  Then  their 
space-time  interpolation  is  obtained  by  that  p(s,  t) 
which  minimizes 


K  m(k) 


*=1 1=1 


+ 


+ 


dsdt 

dsdt 


(22) 


where  the  time  variable  is  discretized  as  t  =  kAt, 
k  -  and  is,-  are  the  weights  indicating 

confidence  in  the  observations;  the  constants  oi  and 
«2  control  spatial  continuity  (tension)  and  linearity 
(smoothness);  and  similarly,  the  constants  A  and 
/32  control  the  temporal  constraint  and  linearity. 
Equation  (22)  can  be  interpreted  in  terms  of  Kalman 
filter-based  equations  as; 


(£)-(i  £)  Or  )♦(:;:)  <» 

and 

/q*\  /H(p  k,k)\  /yHk\ 

(;)■(  £  Md  <24) 

where  q  is  the  observation  vector  and  H  is  the 
data-estimate  correspondence  operator  with  its  (i,  j)th 
element  defined  as 


/  1  if  Si  =  jAs 
\  0  otherwise; 


Si  and  S2  are  matrices  of  first  and  second  order 
operators;  w,  are  independent,  zero-mean,  Gaussian 
error  vectors  with  P~ll  as  covariance  matrices;  and 
vh,  vi  and  v2  are  zero-mean  Gaussian  random  error 
vectors  with  covariance  matrices  diagOf  \  •  •  • ,  j/"1), 
oj^l,  and  a^1!. 

Writing  d*  =  p*+i  -  p*,  then  eq.  (23)  can  be 
rewritten  as 

(SIMS  0(5l:0*te) 

The  Kalman  formalization  provided  by  eqs.  (24) 
and  (25)  is  the  same  in  spirit  as  that  of  Mari¬ 
ano  (1990),  except  that  it  employs  two-dimensional 
phase  speeds  and,  in  addition,  is  optimal  in  the  least 
squares  sense.  These  equations  can  be  solved  by  pro¬ 
cedures  given  in  Gelb  (1974)  and  Lewis  (1986).  How¬ 
ever,  there  is  an  inherent  difficulty  in  that  the  spa¬ 
tial  coordinates  of  the  digitized  points  are  unknown; 
thus  the  observation-data  correspondence  matrix  H 
is  also  undefined.  Also,  the  origin  of  the  spatial  co¬ 
ordinate  s  is  difficult  to  define.  The  arc  length  coor¬ 
dinates,  8i  must  be  determined  concurrently  as  the 
contours  are  interpolated. 

The  above  problems  are  alleviated  by  taking  an 
“adaptive  approach  where  best  predictions  of  the 
GSNW  contour  data  given  time  k  are  used  to  esti¬ 
mate  the  positions,  i.e.,  arc  length  indexes,  $(&),  of 
the  measurement  along  the  contour”.  An  iterative 
procedure  is  employed  where  for  each  time  index  k, 
Si(k )  and  p(k)  are  estimated  alternately  using  the 
best  guess  of  the  other.  The  observation-data  corre¬ 
spondence  matrix  H(pfc,  ifc)  in  eq.  (24)  is  evaluated 
as  the  H(py(fc  -  1),  ifc)  in  the  forward  filter  and  as 
H(pj(/fc+ 1),  k)  in  the  backward  filter  where  p j{k)  and 
p t(k)  the  forward  and  backward  filtered  estimates. 
In  other  words,  a  correspondence  is  set  up  for  the  ob¬ 
servations  at  time  k  with  the  curve  predicted  to  time 
k  from  an  estimated  contour  at  time  ifc  - 1.  The  itera¬ 
tion  procedure  was  terminated  after  convergence  or 
after  a  fixed  number  of  iterations.  Because  of  gaps 
in  measurements  at  any  time  ifc,  the  estimates  of  the 
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previous  frame,  p*_i,  are  often  the  best  guess  of  the 
general  shape  at  time  /b. 

The  correspondence  referred  to  above  is  per¬ 
formed  ‘hierarchically’,  first  using  large-scale  ‘fea¬ 
tures’,  followed  by  smaller,  local  and  then  using  in¬ 
flections  of  the  curves.  Based  on  the  interpolation  of 
150  frames,  CM93  conclude  the  following: 

•  The  algorithm,  in  general,  is  quite  successful 
in  reconstructing  features  like  the  ‘S’  and  ‘O’ 
shapes.  Fig.  3  is  an  example  of  two  such 
realizations. 

•  The  Kalman-type  formulation  provides  the  er¬ 
rors  of  estimations  as  shown  in  Fig.  3. 

•  The  algorithm  is  unable  to  resolve  some  fast 
movements  of  the  meanders  and  to  detect  trans¬ 
formations  of  the  meanders  into  rings. 

•  The  algorithm  is  dependent  on  the  digitization 
process  of  the  GS  positions  using  SST  data,  which 
in  itself  is  quite  a  laborious  process.  (The  work  of 
Cayula  and  Comillon  (1992)  should  alleviate  this 
problem  to  some  extent). 

•  The  present-day  pattern  recognition  and  match¬ 
ing  algorithms  have  yet  to  realize  flexibility  and 
sensitivity  of  trained  personnel’. 

&S  Dynamic  Interpolation 

The  dynamic  interpolation  approach  combines 
dynamic  information  from  a  numerical  model  with 
the  actual  observations  on  the  thermo-dynamic 
structure.  It  is  not  necessary  that  the  observations 
be  on  the  feature  to  be  extracted.  The  information 
content  in  an  observation  is  first  spread  in  the  neigh¬ 
boring  areas  statistically  using  spatial  and  tempo¬ 
ral  covariance  functions,  and  then  communicated 
dynamically  via  numerical  integration.  The  three- 
dimensional  numerical  model  output  is  then  sub¬ 
jected  to  a  feature  detection  algorithm. 

Briefly,  let  T  be  the  true  state  of  the  ocean,  which 
is  to  be  estimated  by  combining  the  model  output, 
Tm,  and  observations  T0.  The  vectors  T  and  Tm 


are  .V -vectors  defined  on  the  model  grid,  Gm',  the 
observation  vector,  T0  is  an  M -vector  defined  on  the 
observation  grid  Q,.  Usually,  M  «  N,  and  T0 
alone  cannot  provide  an  adequate  representation  of 
T.  Thus,  assimilation  is  performed,  resulting  in  an 
estimate  on  Gm  to  provide  initial  conditions  for  the 
numerical  integration  of  the  model  equations. 

We  assume  there  exists  a  mapping  D(£„)  = 
Q0.  Then  T„,  the  true  state  of  the  ocean  at  the 
observation  grid,  can  be  written  as  T0  =  D(T).  Often 
D  is  assumed  to  be  a  linear  mapping  so  that: 

To  =  DT.  (26) 

The  assumption  here  is  that  both  the  model 
values  and  the  observational  data  are  unbiased 
estimators  of  the  true  state  of  the  ocean,  and  that 
the  optimal  combination  of  the  two  will  lead  to  an 
estimator  of  T,  which  has  a  smaller  error  variance 
than  either  of  the  two.  Under  the  unbiasedness 
assumption,  we  can  write  the  following  linear  model: 

(SM&Ms)  <*> 

where  e™  and  e„  are  vectors  of  zero  mean  random 
errors  in  the  model  output  and  observations,  with 
£m  and  £„  as  their  respective  covariance  matrices. 
It  is  reasonable  to  assume  that  the  errors  in  T0  and 
Tm  are  statistically  independent.  Then,  the  least 
squares  solution  to  the  true  state  T  is  obtained  by 
minimizing  the  quadratic  functional: 

Q  =  (Tm  -  T YV-HTm  -  T) 

+  (T0  -  DTySjHTo  -  DT)  (28) 

where  the  prime  indicates  matrix  transposition. 
Navon  and  Liegler  (1987)  describe  several  efficient 
procedures  to  affect  the  above  minimization. 

The  estimate,  1',  of  the  true  state  of  the  ocean, 
T,  obtained  from  the  above  minimization  procedure 
is  referred  to  as  dynamical  interpolation.  It  reflects 
the  information  content  of  the  dynamical  model  and 
the  observations.  Appropriate  algorithms  can  be 
used  for  OFI.  The  modelers  in  DAMEE-GSR  used 
the  15  deg  isotherms  at  200  m  depth  algorithm 
on  dynamically  interpolated  output  to  derive  the 
GSNW  (Perkins,  1993). 
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6.6  Eddy  Detection 

Eddy  detection  from  the  labeled  edges  is  per¬ 
formed  in  either  circular  or  elliptical  shapes.  The 
algorithm  used  in  the  Navy’s  SAMaS  is  the  Hough 
transform  (Lybanon  and  Holyer,  1991),  which  can  be 
used  to  detect  lines  and  curves  in  pictures.  It  es¬ 
sentially  involves  computing  convolutions  of  the  la¬ 
beled  edges  with  various  binary  kernels  represent¬ 
ing  a  possible  shape  that  can  be  ascribed  to  the  la¬ 
beled  edges.  The  curve  selected  is  the  one  for  which 
the  kernel  shape  optimizes  the  convolution.  Thus, 
if  circles  are  to  be  assigned  as  eddy  shapes  to  the 
labeled  edges,  then  the  kernel  library  of  possible 
shapes  consists  of  circles  of  Is  of  possible  radii.  Duda 
and  Hart  (1972)  provide  a  general  description  of  the 
circular  Hough  transform.  The  solution  by  this  pro¬ 
cedure  is  equivalent  to  the  least  mean  squares  so¬ 
lution  when  the  edge  pixels  are  assumed  to  be  con¬ 
taminated  with  identically  and  independently  dis¬ 
tributed  random  noise. 

Szczecchowski  (1991)  proposed  the  Marr-Hildreth 
operator  as  an  edge  detector.  The  Marr-Hildreth  op¬ 
erator  performs  the  convolution 

E(x,  y )  «  V2G(r)  *  I(x,  y ) 

where  V2  is  the  Laplacian  operator,  *  represents  the 
convolution,  G(r)  is  the  Gaussian  function  in  polar 
coordinates 


and  I(x,  y)  is  the  two-dimensional  image  intensity. 
Tb  obtain  a  better  delineation  of  the  eddy  curve,  con¬ 
volutions  are  performed  at  more  than  one  value  of 
<r.  The  zero-crossings  of  E(x,  y)  determine  the  eddy. 
The  eddy  center  is  determined  from  the  mean  of  the 
pixel  locations  enclosed  within  the  curve.  The  Marr- 
Hildreth  operator  possesses  several  desirable  prop¬ 
erties:  It  attenuates  high  frequencies;  there  is  only 
one  free  parameter,  which  is  the  scale  parameter,  a ; 
no  thresholds  are  needed  for  edge  detection;  and,  the 
resultant  edges  form  a  closed  continuous  curve.  For 


further  details  of  the  algorithm,  see  Szczecchowski 
(1991). 

Other  algorithms  for  circle  detection  that  opti¬ 
mize  some  distance  function  between  the  center  and 
the  edge  pixels  have  been  considered  for  eddy  de¬ 
tection.  The  Thomas-Chan  algorithm  minimizes  a 
theoretical,  area-based  error  function  (Thomas  and 
Chan,  1989)  to  give  an  explicit  formula  for  the  cen¬ 
ter  and  radius  of  the  circle  in  terms  of  the  edge-pixel 
coordinates.  The  Landau  (1987)  algorithm  is  an  it¬ 
erative  algorithm  which  minimizes  the  mean  dis¬ 
tance  between  the  edge-pixels  and  the  center.  The 
Albano  (1974)  fits  a  general  conical  curve  using  the 
least  squares  formulation.  Peckinpaugh  and  Holyer 
(1994)  performed  a  relative  evaluation  of  these  and 
the  Hough  transform  circle  detection  algorithms;  the 
Thomas-Chan  algorithm  provided  the  best  overall 
accuracy  in  the  tested  examples. 

7.  TWO  GULF  STREAM  IDENTIFICATION 

SYSTEMS 

At  present,  the  surface  NW  location  is  deter¬ 
mined  subjectively  by  the  Naval  Oceanographic  Of¬ 
fice  Warfighting  Support  Center  (WSC)  using  pre¬ 
dominantly  AVHRR  data.  As  discussed  earlier,  this 
operation  may  be  greatly  hampered  because  of  cloud 
cover,  which  conceals  ocean  features.  In  cloud- 
covered  areas,  the  operator  must  make  a  completely 
subjective  estimate  of  frontal  location,  usually  based 
on  its  last  known  position.  This  procedure,  in  ad¬ 
dition  to  being  ponderous  and  time  consuming,  is 
quite  subjective.  There  obviously  is  an  urgent  need 
to  develop  an  automatic  and  objective  procedure  to 
perform  this  tedious  but  important  task.  The  Navy 
has  taken  a  lead  in  this  task  and,  in  a  systematic 
manner,  identified  and  constructed  various  compo¬ 
nents  of  feature  identification  as  a  part  of  their  Semi- 
Automated  Mesoscale  Analysis  System.  Included 
here  also  is  the  URI  system.  Although  it  is  not  as  up- 
to-date  as  SAMAS,  the  URI  algorithm  development 
seems  to  be  headed  in  the  direction  of  a  complete  sys¬ 
tem.  The  following,  although  somewhat  repetitive, 
is  provided  as  brief  and  up-to-date  information  on 
the  two  complete  systems. 
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Figure  4.  A  three-tiered  approach  to  automated  oceanographic  satellite  image  analysis.  (Adapted 
from  Holyer  and  Peckinpaugh,  1990) 


7.1  Navy's  Semi-Automated  Meso scale 
Analysis  System  (SAMAS) 

A  three-tiered  approach  of  SAMAS  is  shown  in 
Fig.  4,  and  a  functional  block  diagram  is  shown 
in  Fig.  5.  Although  the  individual  components 
have  been  described  above,  a  brief  description  of 
the  entire  system  is  given  here  for  the  sake  of 
completion. 

Image  Segmentation:  Starting  with  a  satellite  im¬ 
age,  the  digitized  gray  levels  are  converted  into  a 
gray  level  co-occurence  matrix  (GLC).  The  GLC  is 
then  used  to  compute  the  cluster  shade  measure, 
S(Ax  »  0,Ay  =  0),  in  overlapping  local  neigh¬ 
borhoods.  The  center  point  of  the  neighborhood  is 
then  replaced  by  S(Ax  =  0,  Ay  =  0).  Because  this 
measure  is  the  third  central  moment  of  the  GLC,  it 
changes  sign  whenever  there  is  a  significant  change 
in  the  distribution  of  5  over  the  neighborhood.  The 
specification  of  the  zero-crossings  then  specifies  a 
frontal  pixel. 


Feature  Labeling  and  Synthesis:  Each  frontal- 
pixel  determined  in  the  last  step  is  assigned  a  fea¬ 
ture  lable  using  the  Nonlinear  Probabilistic  Relax¬ 
ation  algorithm.  The  algorithm  assigns  different 
feature  probabilities  to  each  pixel,  which  are  im¬ 
proved  by  successive  iteration  taking  into  consider¬ 
ation  previous  analyses  and  any  new  information. 
The  pixels  under  each  feature  are  then  spatially  in¬ 
terpolated  to  construct  a  continuous  GS  front  or  a 
circular  eddy  from  fragmented  renditions  of  these 
features.  The  GS  front  is  interpolated  using  the 
complex  empirical  orthogonal  function  (CEOF)  ap¬ 
proach.  The  CEOF’s  represent  the  statistical  in¬ 
formation  extracted  from  empirical  data  on  the  GS 
front.  For  each  individual  image,  the  modal  ampli¬ 
tudes  are  computed.  A  weighted  sum  of  these  CEOF 
vectors,  with  the  modal  amplitudes  as  the  weights, 
provides  an  estimate  of  the  GS  front  at  the  present 
epoch. 
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Figure  5.  Functional  block  diagram. 


Although  not  emphasized  in  Section  6,  SAMAS 
incorporates  an  eddy  synthesis  module  also,  which 
uses  the  circular  Hough  transform  (Duda  and  Hart, 
1972).  This  method  is  capable  of  finding  the  most 
prominent  eddies  and  gives  the  center  (xe,  ye)  and 
the  radius  r  even  when  significant  parts  of  the  edge 
points  are  missing. 

Oceanographic  Expert  System:  The  expert  sys¬ 
tem  in  SAMAS  fills  in  temporal  gaps  of  the  features 
by  tracking  their  evolution  in  terms  of  the  speed,  di¬ 
rection  and  size  (in  case  of  an  eddy).  It  serves  an 
additional  function  in  that  it  provides  a  first  guess 
of  the  feature  positions  for  the  Relaxation  Labeling 
Algorithm. 

Interactive  Editors:  The  Navy  system  provides  an 
interactive  editing  capability  for  the  analyst  to  view 
the  results  and  delete  and  modify  the  features  that 
are  automatically  synthesized. 


7.2  UBTs  Feature  Detection  System 

At  present,  the  URI  feature  detection  system 
does  not  have  all  the  required  components.  How¬ 
ever,  incorporation  of  a  few  well-tested  modules  can 
make  this  system  a  good  competitor  to  the  Navy’s 
SAMAS.  Their  edge  detection  algorithm  is  fairly  well 
advanced.  Because,  the  presence  of  clouds  in  the  im¬ 
age  can  distort  the  edges,  there  is  a  major  emphasis 
on  isolation  and  removal  of  clouds  from  the  images. 
URI  has  performed  a  considerable  amount  of  analy¬ 
sis  of  the  AVHRR  satellites  images  of  the  North  At¬ 
lantic  GS  region  and  provided  the  resulting  edges  to 
the  academic  community  for  further  research.  The 
algorithm  relies  on  a  combination  of  methods  and 
operates  at  three  levels  described  below:  the  pic¬ 
ture,  the  window  and  the  local  level.  The  algorithmic 
steps  are  shown  in  Fig.  6. 


Picture  Level:  This  analysis  is  performed  at  the 
overall  picture  level.  The  main  function  at  this  level 
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Figure  6.  Flowchart  of  Cayula-Cornillon  edge  detection 
algorithm  for  SST  images.  (Adapted  from  Cayula  and 
Cornillon  (1992)) 

is  to  isolate  and  remove  the  obvious  cloudy  region. 
The  algorithm  exploits  the  physical  characteristics 
to  remove  the  clouds  in  four  steps:  (i)  Clouds  are 
colder  than  underlying  sea  surface,  so  a  rough 
thresholding  is  used  to  separate  into  non-cloudy 
and  possibly  cloudy  regions;  (ii)  Cloudy  regions  are 
characterized  by  high  gradient  magnitudes  that  are 
non-coherent.  Thus,  if  the  ratio  of  the  magnitude  of 
the  gradient  sum  to  sum  of  the  gradient  magnitudes 
is  small/large,  classify  the  region  cloudy/clear;  (iii) 
For  the  regions  with  an  in-between  ratio,  use  the  fact 
that  clouds  are  bulky  regions;  (iv)  finally,  use  a  3  x  3 
median  filter  and  remove  the  cloudy  regions 


Window  Level:  At  this  level,  the  median-filtered 
data  from  the  Picture  Level  are  examined  closely 
in  32  x  32  overlapping  windows.  Each  window 
is  processed  independently  for  the  presence  of  an 
edge.  Even  at  this  level,  minute  attention  is  paid 
to  removal  of  clouds  that  are  fine  at  this  stage.  A 
histogram  analysis  of  the  temperature  values  within 
the  window  is  performed.  In  the  presence  of  an  edge 
within  the  window,  the  histogram  will  be  bimodal, 
indicating  two  different  temperature  populations 
within  the  window.  The  bimodality  is  statistically 
tested.  This  is  followed  by  a  correlation  analysis 
to  check  whether  either  of  the  populations  could  be 
classified  as  cloudy.  If  not,  then  a  cohesion  test 
ensures  that  the  bimodality  is  actually  due  to  a  front 
and  not  due  to  a  freak  temperature  distribution.  An 
indicator  function  then  assigns  values  of  0  and  1 
according  to  which  population  the  pixel  belongs.  The 
boundary  of  Ob  and  Is  then  defines  the  edge  pixels. 

Local  Level:  At  this  level,  the  Contour-Following 
Algorithm  is  employed  to  collect  different  edge  pixels 
into  coherent  contours  that  would  represent  various 
fronts  and  eddies  in  the  mesoscale  picture.  Analysis 
beyond  this  step  requires  human  intervention,  as 
these  contours  have  not  been  assigned  any  labels. 

8.  OTHER  OFI  EFFORTS 

There  have  been  efforts  in  the  atmospheric  sci¬ 
ences  to  objectively  analyze  satellite  data  for  cloud 
coverage  and  radar  data  to  determine  gust  fronts. 
For  cloud  coverage  resolution  we  refer  to  the  re¬ 
cent  work  of  Woodberry,  Tanaka,  Hendon  and  Salby 
(1991).  Their  work  was  also  motivated  from  the 
enormity  of  the  satellite  data.  Synoptic  images  of  the 
global  cloud  cover  pattern,  composited  from  six  con¬ 
temporaneous  satellites,  provide  an  unprecedented 
view  of  the  global  cloud  field.  Having  horizontal  res¬ 
olution  of  about  0.5  deg  and  temporal  resolution  of  3 
hours,  the  global  cloud  imagery  (GCI)  resolves  most 
of  the  variability  of  organized  convection,  including 
several  harmonics  of  the  diurnal  cycle.  Although 
the  GCI  has  attractive  features,  the  dense  three- 
dimensional  nature  of  the  data  make  it  a  formidable 
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volume  of  information  to  treat  in  a  practical  and  effi¬ 
cient  manner.  They  developed  an  interactive  image 
analysis  system  to  investigate  the  space-time  vari¬ 
ability  of  the  global  cloud  cover  behavior.  Their  sys¬ 
tem  integrates  data,  hardware  and  software  into  a 
single  system  to  provide  a  variety  of  space-time  co- 
variance  analyses  in  a  menu-driven  format. 


to  multiplicatively  combine  the  influence  of  predic¬ 
tors.  Here  we  focus  on  two  methods  that  have  been 
recently  implemented  in  CAST  for  oceanographic  vi¬ 
sualization.  One  uses  a  3-D  edge  operator  algorithm 
for  feature  extraction,  while  the  other  uses  physical 
features  to  track  eddies  as  they  move  across  the  re¬ 
gion  of  interest. 


Analysts  of  radar  data  have  been  interested  in 
extracting  all  possible  information  from  radar  scans 
that,  like  satellites,  also  provide  large  amounts  of 
data.  Interactive  graphical  displays  of  these  data 
have  formed  the  basis  for  gleaning  new  knowledge 
from  the  radar  systems.  However,  for  more  practi¬ 
cal  applications  to  aircraft  safety,  there  have  been 
efforts  to  devise  an  automatic,  objective  identifica¬ 
tion  of  gust  fronts  using  real  time  radar  data.  Uyeda 
and  Zroic  ( 1986)  Automatic  Detection  of  Gust  Fronts 
is  one  such  study.  They  used  a  pattern  recognition 
algorithm  based  on  radial  convergence  of  winds  to 
determine  radial  bands  of  gust-front  activity  and  pa¬ 
rameterize  them  in  range  and  azimuth  by  fitting  the 
azimuth  as  a  quadratic  function  of  the  range. 

Similarly,  Goodrich  et  al.  (1991)  have  developed 
a  thin-line  detection  algorithm  to  detect  associated 
convergence  lines  in  the  radar  reflectivity  data.  For 
objective  featureidentification,  they  also  construct  a 
local  least  squares  quadratic  model  of  the  reflectivity 
in  terms  of  range  and  azimuth. 

9.  FRONT  LOCATION  FROM  MODEL 

OUTPUT 

Oceanic  features  like  ocean  eddies  and  fronts 
are  three-dimensional  features.  Unfortunately,  ob¬ 
servational  data  in  three  dimensions  are  not  rou¬ 
tinely  available  in  real  time  to  even  contemplate 
their  extraction.  However,  model  output  is  available 
over  time  in  three  dimensions,  and  it  is  of  consider¬ 
able  importance  to  extract  objectively  3-D  features 
or  track  their  movement  over  space  and  time. 

There  have  been  efforts  to  derive  features  from 
the  model  output.  One  recent  study  by  Fine  and 
Fraser  ( 1990)  developed  a  statistical  pattern  recog¬ 
nition  technique  called  IREW  based  on  Baye’s  rule 


9.1  Feature  Identification  With  3-D  Edge 
Operator 

A  3-D  edge  operator  was  first  used  by  Zucker  and 
Hummel  (1981)  for  feature  detection  of  CT  images 
defined  on  unit-spaced  pixels.  Moorhead  and  Zhu 
(1993)  extended  this  operator  to  work  on  3-D  data 
volumes  defined  on  Cartesian  grids  and  irregular 
grid  intervals  in  one  direction  and  applied  it  to  the 
numerical  output  of  a  Gulf  of  Mexico  model  (Dietrich 
et  al.,  1993). 


Let  fit ,  x,  y,  z)  be  a  four-dimensional  scalar  field 
defined  at  time  t  on  the  grid  point  ix,y,z).  The 
algorithm  starts  by  retaining  all  points  (x,  y,  z)  such 


that 


cr2(/(t,z,y,z))  ^ 

[£(/«,  x,j/,z))P  -  ‘ 


(29) 


where  Lt  is  a  temporal  threshold.  This  step  elim¬ 
inates  time-invariant  edges  such  as  the  coastlines 
and  noisy  data  and  reduces  computational  require¬ 
ments. 


For  the  remaining  points,  the  3-D  edge  oper¬ 
ator  algorithm  computes  the  three  Cartesian  gra¬ 
dients  and  then  takes  the  amplitude  of  their  vec¬ 
tor  sum.  The  grid  points  with  the  maximum  com¬ 
posite  gradient  amplitude  defines  a  neighborhood 
with  the  greatest  dynamic  changes  to  be  tested  for 
oceanic  front/eddy  detection.  This  edge  computation 
produces  a  companion  edge  volume  to  the  original 
data  volume  containing  clusters  of  connected  edges. 
These  edge  clusters  approximate  the  boundaries  of 
possible  features.  True  features  are  identified  based 
on  the  edges.  Obviously,  it  is  much  more  complex 
to  comprehend  3D  objects  than  2D  ones.  However, 
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Figure  7.  Positional  notation  for  3  x  3  block. (Adapted  from  Moorhead  and  Zhu  (1993)) 


in  analyzing  oceanographic  data  sets,  strong  spa¬ 
tial  correlations  are  found  between  horizontal  data 
slices  within  a  data  volume.  Therefore,  recognition 
and  reconstruction  of  3D  features  was  simplified  and 
speeded  up  by  spatially  correlating  those  features 
extracted  from  2D  data  slices.  Specifically  for  a  fea¬ 
ture  like  eddy,  these  correlations  included  functional 
values,  edge  values  and  shapes. 

The  3-D  edge  operator  acts  on  a  3  x  3  x  3 
data  block  around  the  point  under  examination. 
Using  the  notation  of  Moorhead  and  Zhu  (1993), 
at  any  time  t  the  data  in  a  three-dimensional  co¬ 
ordinate  system  is  represented  by  f(x,y,z );  the 
model  output  on  the  gridpoint,  (t  ,j,k)  is  repre¬ 
sented  as  f(xi ,  y, ,  zk )  ■  f(i,j,k).  With  this  nota¬ 
tion,  a  grid  point,  say  (1, 1, 1),  is  tested  for  a  point 
on  the  frontal  edge  as  follows.  First,  define  the 
3x3x3  data  block  for  gradient  computation  (Fig. 
7).  For  i,j,k  6  {0,1,2},  let  V( t,j,k)  represent 
the  vec  or  from  the  origin  to  the  gridpoint  ( i,j,k ), 


S (i,j,  k)  =  V(i,  j,  k)  -  V(l,  1, 1)  be  the  vector  differ¬ 
ence,  D(i,j,  k)  =  jS(t,j,  it)|  the  distance  between  the 
two  grid  points,  and 

V/(>,  j,  Jfc)  =  (/(i.j,  k )  -  /( 1, 1, 1  ))/D{i,j,  k). 

Then  the  three  gradients  are  defined  as: 

g*  ■  53  V/(i,  j,  k)[s(i,j,  k)  •  x]x, 

gy  =  V/(i,j',k)[S(i,i,*)y]y,  (30) 

g*  =  53  V/(*.  i.  k)[S(i,j,  k)  ■  z]z, 

where  x,y,z  are  unit  vectors  along  the  respective 
axes.  The  amplitude  square  of  the  overall  gradient 
is  defined  as  (jgx|2  +  |g„p  +  |g* |2). 

For  application  to  ocean  model  output,  Moor¬ 
head  and  Zhu  assumed  constant  grid  intervals,  a 
and  6,  along  the  X  and  Y  directions,  and  allowed 
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for  a  variable  interval  along  the  Z  direction  as  z.  We 
note  that  most  of  the  computational  resource  in  edge 
detection  using  the  above  gradient  algorithm  is  used 
in  computing  dot  products,  which  requires  evalua¬ 
tion  of  the  three  direction  cosines,  aG^+z2)-1/2,  Kc2+ 
z2)-1/*  and  z(c 2  +  z2)-1/2,  where  <?  =  a2  +  b2  is  a 
constant.  Ib  minimize  this  requirement,  they  (1) 
used  the  approximation,  (c2  +  z2)1/2  ~  c  +  z2/2c,  ob¬ 
tained  from  the  second-order  Taylor  series  expan¬ 
sion,  and  (2)  performed  these  computations  in  hori¬ 
zontal  slices  where  the  Z  interval  is  held  constant, 
so  that  these  square  roots  are  computed  once  for  all. 
These  two  steps  make  the  algorithm  extremely  fast. 

At  this  time,  we  note  the  following  points: 

(i)  V /(*,  j,  it)  represents  the  gradient  along  the  vec¬ 
tor  S (i,  j,  k );  the  effect  of  multiplying  it  with  the 
dot  product  term  in  the  summations,  results  in 

k )  -  /( 1, 1, 1))  cos  9,  where  0,  is  the  angle 
that  the  vector  S (i,j,k)  makes  with  the  s-axis, 
s  €  IX,  Y,  Z],  Thus,  the  terms  in  summation  are 
the  total  change  being  resolved  along  the  differ¬ 
ent  axes. 

(ii)  If  a  point  were  to  lie  in  a  volume  of  /  values 
that  are  at  most  varying  linearly  in  x,  y  and  z, 
then  g* ,  gy  and  g,  will  be  almost  negligible.  This 
is  easy  to  see  since  the  point  (1, 1, 1)  lies  in  the 
symmetric  center  of  the  3x3x3  cube,  and  for 
each  V/(i,  j,  k)  there  is  a  symmetric  difference  of 
equal  magnitude  but  opposite  sign.  If  the  point 
being  tested  were  to  lie  inside  the  eddy,  then 
for  small  grid  lengths  we  could  assume  a  linear 
variation.  On  the  other  hand  if  the  point  were 
outside  the  eddy,  it  would  be  in  volume  of  almost 
constant  /  values. 

(iii)  For  a  point  on  the  boundary  of  an  eddy,  it 
is  in  the  middle  of  two  neighborhoods,  one  of 
eddy  characteristics  and  the  other  with  non¬ 
eddy  characteristics.  The  functional  difference 
from  the  eddy  side  will  contribute  something 
that  is  stronger  than  a  linear  variation  (perhaps 
involving  a  jump),  while  the  non-eddy  side  will 


contribute  ~  0.  Thus,  the  above  formulation  is 
quite  appropriate  to  make  the  edge  (or  boundary 
points)  stand  out  from  the  rest. 

(iv)  It  may  be  possible  to  come  up  with  a  statistical 
test  to  the  null  hypothesis  of  a  grid  point,  along 
with  its  neighboring  points  in  the  3x3x3 
cube,  completely  lying  within  a  linearly-varying 
volume.  Under  this  null  hypothesis,  we  can 
form  thirteen  double  sums  of  V /(*',>,  k)  e,g., 
(V/( 2, 2, 2)  +  V/( 0, 0, 0)),  each  having  zero  mean 
and  the  same  variance.  The  number  thirteen 
results  from  the  twenty-six  k),  nine  each 

from  i  a  0,2  and  eight  from  k  =  1  (See  Fig. 
7).  Denoting  these  double  sums  by  s,  their 
mean  by  I,  and  assuming  them  to  be  statistically 
independent  (which  they  are  not,  since  each  sum 
contains  2/(1, 1, 1)  as  common),  the  statistic 


is  distributed  approximately  as  Student’s  t  with  13 
degrees  of  freedom. 

Even  without  this  test  of  hypothesis,  the  al¬ 
gorithm  is  quite  efficient  in  extracting  ocean  ed¬ 
dies  from  model  outputs.  The  algorithm  developed 
is  general  and  may  be  applied  to  various  ocean 
features  by  specifying  different  identification  crite¬ 
ria.  The  performance  of  this  algorithm  was  demon¬ 
strated  on  the  DieCAST  model  output  for  the  Gulf 
of  Mexico  region.  An  example  of  the  sharpness  of 
the  eddy  features  extracted  using  the  algorithm  is 
shown  in  Fig.  8. 

9.2  Objective  Tracking  of  an  Eddy 

In  order  to  analyze  the  thermodynamic  struc¬ 
ture  of  the  ocean  circulation  simulated  by  a  numer¬ 
ical  ocean  model,  it  is  necessary  to  sequentially  ob¬ 
serve,  in  time  and  space,  mesoscale  features  mani¬ 
fested  by  the  model  output.  Because  of  grid  dimen¬ 
sions,  as  well  as  the  frequency  of  sampling,  such  ex¬ 
aminations  are  ponderous  at  best.  The  algorithm 
developed  here  objectively  tracks  an  eddy  over  time 
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Figure  8.  An  example  of  extracted  eddies  that  are  animated  temporally  to  show  their  characteristics 
variations.  The  sequence  progresses  from  left  to  right  and  top  to  bottom  (lexographical  order). 
(Adapted  from  Zhu  et.  al.  (1993)) 


and  space,  lb  initiate  the  tracking  process,  model 
output  is  displayed  on  the  screen  so  that  the  region  of 
activity,  the  to-be-tracked  eddy,  is  identified.  From 
this  point  on,  the  algorithm  homes  in  on  the  eddy  as 
it  evolves  over  time  and  space. 

The  eddy  can  be  tracked  with  respect  to  any 
of  the  thermodynamic  parameters,  e.g.,  SSH,  SST, 
speed  of  eddy  circulation,  etc.  For  further  sophisti¬ 
cation,  one  may  utilize  the  subsurface  values.  How¬ 
ever,  the  purpose  of  this  algorithm  is  to  follow  the 
general  movement  of  the  eddy;  that  includes  a  gen¬ 
eral  neighborhood  of  the  eddy  activity  without  spec¬ 
ifying  an  exact  eddy  center. 


Objective  Tracking  Criterion:  The  derivation  of  this 
algorithm  depends  on  the  monotonic  behavior  of 
the  various  parameters  employed  in  the  tracking. 
For  instance,  if  the  eddy  is  a  cyclonic  eddy  and 
SST  is  used  to  track  its  motion,  then  the  eddy  is 
a  warm  core  eddy,  characterized  by  monotonically- 
decreasing  temperatures  as  we  move  away  from  the 
eddy  center.  If  the  temperature  of  the  eddy  surface 
is  viewed  in  three  dimensions,  it  will  look  like  an 
inverted  paraboloid  (See  Fig.  9a).  Similarly,  if 
SSH  is  used  for  tracking,  the  sea  surface  elevation 
will  be  maximum  at  the  eddy  center,  and  it  will 
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Figure  9.  A  three  dimensional  presentation  of  a  warm-core  eddy  in  terms  of  its  (a)  sea-surface 
temperature,  (b)  searsurface  height  and  (c)  surface  current  speed.  Note  that  the  minimum  current 
speed  occurs  at  the  center  of  the  eddy. 
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Figure  10.  Six  images  of  an  animated  sequence  of  two  synchronized  windows  for  eddy  tracking. 
Window  1  (upper)  is  the  horizontal  cross-section  of  the  temperature  field  with  the  cross-hair  at  the 
eddy  center  and  window  2  (lower)  is  the  vertical  (XZ)  cross-section  at  the  latitude  of  the  eddy  center 
represented  by  the  cross-hair.  (Adapted  from  Lakhamaraju  (1993)) 
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circulation  in  terms  of  SSH.  However,  if  current 
speed  is  the  criterion,  then  the  speed  at  the  eddy 
center  is  the  minimum,  and  it  increases  as  we 
move  away  from  the  center.  In  fact,  it  has  been 
observed  in  the  Gulf  of  Mexico  that  eddies  rotate 
as  a  solid  body.  A  three-dimensional  graph  of 
speed  of  eddy  currents  will  appear  like  a  parabola, 
with  its  vertex  at  the  minimum  (Fig.  9c).  Thus 
for  a  selected  parameter,  the  algorithm  invokes  its 
monotonic  behavior,  fits  the  parameter  to  a  local, 
two-dimensional  smooth  surface  as  a  function  of  X- 
longitude  and  Y-latitude,  and  finds  a  local  extreme 
point.  Because  of  the  monotonic  behavior  of  the  eddy 
circulation  parameters,  this  extreme  point  will  be 
a  maximum  for  SST  and  SSH  and  a  minimum  for 
the  current  speed.  In  any  case,  this  extreme  point 
will  determine  the  eddy  center.  Fig.  10  displays  six 
images  of  on  animated  sequence  of  two  synchronized 
windows  for  eddy  tracking. 

Least  Squares  Formulation:  Let  (x,  y)  be  the  location 
in  terms  of  (longitude,  latitude).  Then,  given  an 
approximate  starting  position  of  the  eddy  center, 
(x0,  y0),  select  all  grid  points  within  a  radius  R.  Let 
us  fit 

C  =  ax2  +  bxy  +  ey2  +  dx  +  ey  +  f  (32) 

where  C  is  the  value  of  parameter  used  to  character¬ 
ize  the  eddy  (SSH,  SST,  or  current  speed)  at  locations 
(x,  y).  Once  the  quadratic  surface  in  eq.  (32)  is  spec¬ 
ified,  we  can  find  its  extreme  point  by  equating  to 
zero  the  first  derivatives  with  respect  to  x  and  y,  i.e., 


In  fitting  the  quadratic  least  squares  surface  eq. 

(32) ,  it  will  be  necessary  to  rearrange  the  indices  so 
that  one  can  cast  eq.  (32)  in  a  standard  format: 

Y  =  Xf) 

where  Y  is  the  vector  obtained  from  rearranging 
bi  in  a  single  vector,  0  =  (A,...  A y,  with  A  = 
a,  02  =  b,  03  =  c,  &  =  d,  =  e  and  A  =  /;  X  = 
(Xfc  /)  is  n  x  6  matrix  with  the  index  k  corresponding 
to  the  observation  index  of  %  and  j  corresponding 
to  the  parameter  index  of  A  Suppose  the  index 
k  corresponds  to  the  grid  point  (i,j),  then  Xu  = 

xij  i  X*2  =  XijViji  Xfc3  =  yfj ,  X*4  =  X{j ,  XkS  —  yij 

and  X*6  =  1. 

The  following  simple  steps  are  involved.  Start 
with  initial  position  pi  =  ( x/,y/ )  and  assume  the 
initial  eddy  center  velocity  V\  =  (Vi,  Vy)  is  zero.  Say 
we  have  determined  eddy  centers  pj  =  ( Xj,yj )  at 
times  tj ,  j  =  1,  •  •  • ,  n.  The  initial  position  for  the 
next  step  is  defined  by  p/  ~  pn  +  K.ftn+1  -  t„)  at 
time  where  V„  =  with  Vi  *  0  All  grid 

points  within  the  specified  radius  R  of  (x/,y/)  are 
then  determined  and  the  two-dimensional  array  Qj 
and  ( x^, yij )  are  loaded  into  Y*  and  X*|.  A  least 
squares  algorithm  is  used  to  determine  coefficients 
0  that  are  interpreted  in  terms  of  eq.  (32).  Solve  eq. 

(33)  to  find  pn+1  =  (xn+i,  yn+j). 

10.  CONCLUDING  REMARKS 


2ax  4-  by  +  d  =  0 
bx  +  2  cy  +  e  =  0 


(33) 


Solving  eq.  (33),  we  estimate  the  eddy  center  by 


J/i 


be  —  2cd 
4oc  —  b 2 ’ 
bd  —  2ae 
4 ac  -  62 


(34) 


Assuming  that  the  time  interval  between  eddy 
tracking  is  moderate,  one  can  derive  the  next  guess 
using  the  newly-estimated  eddy  center  position 
(*i,  yi)  and  modifying  by  incorporating  the  past  in¬ 
formation  on  the  eddy  center  velocity  and  then  re¬ 
peating  the  steps  described  above. 


We  have  provided  a  brief  review  of  objective 
methods  that  have  been  used  for  identification  and 
tracking  of  mesoscale  ocean  features  like  eddies  and 
fronts.  The  North  Atlantic,  Gulf  Stream  region  has 
been  the  main  area  of  feature  identification,  primar¬ 
ily  because  of  the  well-defined,  sharp  features  of  the 
Gulf  Stream,  the  large  amounts  of  data  available 
from  a  large  number  of  experiments  conducted,  and 
because  of  Navy  interest  in  the  region.  Also,  the  re¬ 
gion  eiyoys  availability  of  large  amounts  of  remotely 
sensed,  satellite  data  providing  information  on  SST 
and  SSH.  Thus,  most  of  the  methodologies  developed 
and  reviewed  pertain  to  the  GS  region. 
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used  by  the  oceanography  community  in  its  analyses 
of  the  various  kinds  of  data.  In  our  view,  the  best 
objective  method  would  be  to  dynamically  interpo¬ 
late  the  available  information  by  appropriate  data 
assimilation  methods  and  extract  the  features  of  in¬ 
terest.  This  approach  will  take  advantage  of  all  pos¬ 
sible  data,  surface  or  sub-surface,  and  will  dynami¬ 
cally  interpolate  the  features  in  the  missing  part  of 
the  satellite  image.  The  features  can  be  extracted  by 
interpolating  for  the  specific  feature  characteristics, 
e.g.,  the  7i5,  the  15  deg  isotherm  at  200  m  depth  to 
determine  the  GSNW  or  use  the  general  edge  detec¬ 
tion  procedures  of  Section  5.  This  takes  both  data 
and  dynamics  into  consideration.  This  capability  is 
still  somewhere  in  the  future,  since  the  ocean  models 
have  not  attained  the  necessary  maturity. 

As  an  alternative,  the  four-tiered  algorithmic 
approach  fostered  by  Lybanon  and  Holyer  (1991) 
comprising  edge  detection,  edge  labeling,  feature 
construction  and  the  expert  system  is  quite  appro¬ 
priate.  The  modular  approach  adopted  in  the  Navy’s 
SAMAS  promises  to  be  an  excellent  system.  Im¬ 
provements  and  modifications  to  individual  modules 
can  be  affected  independently. 

The  edge  detection  system  based  on  the  Clus¬ 
ter  Shade  Edge  Operator  functions  well  and  seems 
to  avoid  the  noisiness  of  the  gradient-type  edge  op¬ 
erators.  It  is  based  on  the  (fun-normalized)  skew¬ 
ness  measure  of  the  gray-level  co-occurrence  matrix. 
In  evaluation  of  five  different  edge  detection  algo¬ 
rithms,  Cornillon  and  Watts  (1987)  also  found  the 
‘skew9  algorithm  to  be  the  best.  Most  analyses  per¬ 
formed  on  the  GS  front  detection  are  based  on  the 
initial  analyses  of  the  satellite  images  performed  by 
the  URI  group.  The  edge  detection  and  contour  fol¬ 
lowing  algorithm  of  Cayula  and  Cornillon  (1992)  is 
quite  sophisticated  and  is  based  on  objective,  sta¬ 
tistical  methodology  that  effectively  and  systemati¬ 
cally  eliminates  the  effect  of  clouds  in  satellite  im¬ 
ages.  However,  the  UEI  system  still  needs  to  de¬ 
velop  the  subsequent  components  of  the  four-tiered 
approach  mentioned  above. 

The  feature  labeling  algorithm  of  Krishnakumar 
et  al.  (1990a)  is  an  imperative  step  in  the  develop¬ 


ment  of  an  automatic  feature  identification  system. 
This  algorithm  is  based  on  the  Bayes  probability, 
and  it  iteratively  updates  the  probability  of  assign¬ 
ing  a  pixel  its  feature  label. 

For  spatial  interpolation,  and  for  temporal  in¬ 
terpolation  to  some  extent,  the  best  approach  would 
be  to  use  a  Kalman-type  model.  This  approach  can 
easily  incorporate  two-dimensional  phase  speeds  of 
complex  meanders  and  take  advantage  of  estima¬ 
tions  in  the  past  to  arrive  at  an  optimal  estimation  at 
the  current  epoch.  This  approach  has  the  additional 
advantage  of  providing  estimation  error  in  terms  of 
the  estimated  covariance  parameters.  But  even  for 
this  best  approach,  Chin  and  Mariano  (1993)  indi¬ 
cate  that  it  cannot  yet  realize  the  sensitivity  of  the 
trained  personnel.  Perhaps,  the  only  advantages 
over  the  manual  approach  are  the  objectivity,  con¬ 
sistency  and  savings  in  time. 

The  methods  based  on  the  data  and  statistical 
correlations  would  be  next  in  order.  The  Pathfinder 
model,  to  determine  the  mean  GS  axis  using  an  ob¬ 
jective  interpolation  approach,  correlates  the  across- 
stream  anomalies  using  a  parametric  covariance 
structure  that  is  a  function  of  the  along-stream  dis¬ 
tance.  We  were  surprised  that  this  method  did  not 
work  as  well  as  one  might  expect.  But  as  Horton 
(1982)  conjectures,  the  problem  may  be  confounded 
by  mis-specification  of  the  mean  GS  axis  and  of  the 
covariance  structure,  and  also  the  large  variance 
around  the  axis.  For  an  evaluation  of  the  Pathfinder 
and  some  other  algorithms,  see  Szczechowski  (1992). 

The  method  based  on  complex  empirical  orthog¬ 
onal  function  is  also  based  on  the  correlations  of 
the  GS  positions  along  the  axis,  but  it  avoids  pa¬ 
rameterization  of  the  covariance  function.  However, 
the  method  requires  updating  of  the  complex  ampli¬ 
tudes  based  on  few,  new  positional  fixes.  This  was 
done  using  nonlinear  least  squares  optimization. 

In  a  different  analysis  based  on  inverted  echo 
sounders  and  other  estimation  procedures,  Cornil¬ 
lon  and  Watts  (1987)  found  that  the  objective  meth¬ 
ods  were,  at  most,  as  accurate  as  the  subjective  anal¬ 
ysis  of  the  GSNW  as  defined  by  r15.  The  rms  error 
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varied  from  15  to  30  km.  Passi  et  al.  (1991)  using 
an  analysis  of  tomographic  data,  which  is  quite  ac¬ 
curate,  indicated  that  the  limit  of  prediction  of  the 
GSNW  was  about  7  km.  Such  estimation  accuracy 
appears  to  be  far  in  the  future. 

The  GSNW  meanders  are  quite  complex  in 
shape  and  often  occur  in  shapes  like  ‘S’  and  ‘O’.  It 
is  quite  obvious  that  simple  spline-type  interpola¬ 
tion  schemes,  where  the  longitude  is  a  function  of 
the  latitude  as  the  independent  variable,  could  not 
resolve  such  multivalued  meanders.  From  a  close 
examination  of  the  statistical  approaches  reviewed, 
we  notice  that  all  of  them  employ  the  arc  length  of 
the  GS  as  the  coordinate  of  reference,  which  is  quite 
an  ingenious  approach. 

Horton  (1982)  used  the  GS  climatology  as  the 
coordinate  of  reference  and  utilized  observations 
within  a  fixed  time  frame,  assigning  weights  accord¬ 
ing  to  the  age  of  the  observation,  to  perform  opti¬ 
mum  interpolation.  However,  the  algorithm  does 
not  incorporate  information  from  the  past  estima¬ 
tions. 

Molinelli  and  Flanigan  (1987)  used  individual 
axes  for  their  CEOF  estimation.  The  latitude  of 
their  starting  reference  point  varied  but  had  79*  W 
as  the  longitudinal  point.  They  employed  the  past 
estimation  of  the  GS  axis  only  as  an  initial  guess 
in  the  nonlinear  least  squares  estimation  at  the 
current  epoch.  Thus,  the  past  estimation  was 
incorporated  in  the  sense  that  it  was  modified  to 
conform  with  the  new  information. 

However,  it  was  Mariano  (1990)  who  clearly 
stated  that  a  dynamic  frame  of  reference  in  terms  of 
the  arc  length  of  the  GS  was  clearly  the  most  logical 
choice  as  the  coordinate  of  reference.  He  employed 
one-dimensional  phase  speeds  to  adjust  the  past 
and  future  information  on  the  GS  to  interpolate 
in  the  present.  Chin  and  Mariano  (1993)  took 
this  approach  to  the  next  logical  formulation  of  the 
Kalman  filtering. 

In  the  final  analysis,  the  four-tiered  approach 
adopted  by  the  Navy  in  the  development  of  SAMAS 


is  an  important  step  in  taking  advantage  of  the  ever- 
increasing  amounts  of  satellite  data.  It  is  antici¬ 
pated  that  the  system  will  be  constantly  upgraded 
as  the  new  algorithms  are  developed  and  tested. 
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of  interest.  Feature  identification  from  satellite  data  can  be  viewed  as  a  four-step  process:  (1)  edge 
detection;  (2)  edge  labeling;  (3)  spatial  interpolation;  and  (4)  expert  system.  Considerable  effort  is 
required  to  design  a  system  to  go  from  raw  satellite  imagery  to  a  finished  product  in  terms  of 
digitized  frontal  location  and  dynamics.  The  Navy’s  Semi-Automated  Mesoscale  Analysis 
System  (SAMAS)  is  perhaps  the  only  system  that  has  combined  these  steps  in  a  modular  approach, 
although  another  image  analysis  system  has  been  developed  at  the  University  of  Rhode 
Island  (URI).  Brief  descriptions  of  SAMAS  and  the  URI  systems  are  provided.  Finally,  methods 
of  feature  extraction  and  eddy  tracking  from  model  output  are  described. 
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